ABSTRACT


Volume  VIII  deals  with  the  following  topics: 

1. Application of Distribution-Free Tolerance Regions to Pattern
Recognition

Pattern recognition is needed to identify sonar signatures as to
the type of target by which they are generated. Distribution-free
methods are desirable in this context since the probability distributions
underlying the pattern classes are frequently unknown, and it is desirable
to establish some upper bound on at least the expected false-alarm
probability. The recognition method developed has actually been applied to
the recognition of speech waveforms, since these are more easily obtainable
than sonar signatures yet possess some of the same characteristics.

2. Passive Detection and Tracking Using Surface Scattered Signals

Signals reflected from irregular time-varying boundaries such as the
sea surface undergo distortion which limits their detectability and
usability for tracking. The properties of this distortion for correlator
processing are related to the statistical constraints placed upon the time
variation and irregularity of the boundary. Two propagation geometries
are analysed. The first deals with the crosscorrelation of surface
reflected and direct transmission paths, and the second with the
crosscorrelation of surface scattered signals received at two different
locations. This analysis assumes that the signal generated at the target
and the background noise are both Gaussian random variables. Three models
of the scattering mechanism are proposed and two are analysed in detail.
In all cases the correlator output is shown to exhibit very persistent
fluctuations due to the scattering. The existence of these fluctuations is
related to the non-Gaussian nature of the scattered signals. The fourth
order cumulant is computed to show that well spaced scattered signal
samples may be dependent even when they are uncorrelated. Results are
presented for low pass signal spectra and are investigated as a function
of bandwidth.
FOREWORD


This  is  the  eighth  in  a  series  of  reports  describing  work  performed  by  Yale  University 
under a subcontract with Electric Boat division of General Dynamics, prime
contractor. The Office of Naval Research is sponsor of this research under Contract
N00014-68-C-0392  (Contract  Authority  Identification  Number  NR  286-001-1).  LCDR 
J.  F.  Lyding  is  Project  Officer  for  ONR.  Mr.  J.  W.  Herring  is  Project  Engineer 
for  Electric  Boat  division  under  the  direction  of  Dr.  A.  J.  van  Woerkom,  Manager 
of  Scientific  Research. 
I. Introduction


This report is the second of two volumes dealing with work completed
under contract 8050-31-55001 between Yale University and the Electric Boat
Company during the period from July 1, 1968 to April 30, 1970. More
detailed discussions of the results are contained in the two progress
reports Nos. 42 and 43, which are appended. The companion volume (Vol. VII
of this series) covers work done during the same time period and contains
results submitted originally in progress reports No. 38 through 41. The
present volume is concerned with pattern recognition and with detection of
surface-scattered signals, and it therefore represents something of a
departure from previous work, where the emphasis was mainly on signal
processing.

The interest in pattern recognition arose initially from the desire
to identify target types from their sonar signatures, i.e. to determine
whether  a  received  signal  was  generated  by  a  ship,  or  a  submarine,  or 
possibly  a  school  of  fish.  Pattern  recognition  is  however  still  a  rather 
inexact  discipline  relying  rather  heavily  on  ad  hoc  procedures.  Hence 
the  approach  taken  in  a  given  case  depends  very  strongly  on  the  nature  of 
the  application,  and  in  order  to  apply  pattern  recognition  techniques  to 
sonar  signature  discrimination  it  would  have  been  necessary  to  have  had  on 
hand  representative  samples  of  sonar  signals.  These  proved  to  be  not 
easily  available,  and  it  was  decided  therefore  to  transform  the  recognition 
problem  into  a  speech  recognition  problem,  on  the  supposition  that  speech 
waveforms  would  be  roughly  equivalent  to  sonar  waveforms.  This  kind  of 
equivalence  would  of  course  exist  only  for  signals  from  single  hydrophones 
and  information  contained  in  the  spatial  distribution  of  sonar  signals 
from  different  types  of  targets  is  therefore  discarded.  An  initial  attempt 
at  signature  discrimination  would  however  probably  not  have  included 
spatial properties in any case, since these call for entirely different
approaches, some of which are being considered in current research.

Surface-scattered signals must be considered in sonar detection
and communication systems because in many cases signals will be
transmitted to the receiving array not only by the direct path but also by
reflection from the surface (and the bottom). In fact, under certain
shadowing conditions the surface-reflected path may be the only one
transmitting significant energy. In past work the characteristics of the
propagation path have been largely ignored; i.e. only the most elementary
propagation models were used. While many important and valid results were
obtained this way, it has always been clear that many other effects observed
in sonar systems could only be analyzed by considering more sophisticated
models. The only previous effort in this direction is contained in Progress
report No. 13 (Appendix F of Volume II), where the effect of volume
inhomogeneities in producing errors in the bearing estimate was considered.
The surface scattering path studied in the present volume represents
another effort at a more realistic characterization of the propagation path.

II.  Application  of  Distribution-Free  Tolerance  Regions  to  Pattern 
Recognition 

As noted in the introduction, the sonar signature classification
problem has been converted into a speaker recognition problem. The
formulation of the problem dealt with in progress report No. 42 is that
the system is to recognize a main speaker with a fixed expected false
alarm probability. Any test speaker who is not the main speaker is
considered to be an impostor, and a false alarm is defined as the error
committed when an impostor is classified as the main speaker. In addition
to fixing the probability of false alarm the system should correctly
recognize the main speaker as often as possible.

It is assumed that the probability distributions governing the
classes are unknown. Hence with a finite sample size it is impossible
to make precise statements about any of the error probabilities. The
method of Distribution-Free Tolerance Regions makes it possible to fix the
expected probability of one kind of error, here chosen to be the false alarm.
It does not guarantee anything about the errors of the other kind:
different choices of tolerance regions result in different error rates
in any given sample. Thus the best that can be done is to select a method
that appears to have desirable properties and can therefore be expected to
do a good job of maximizing the probability of correct classification.

It is assumed that much of the information for the recognition of
speakers is contained in the transition between phonemes as well as in the
phonemes themselves. For this reason a simple word which contained a
diphthong was analyzed by calculating many short-term spectra over the
length of the word. These spectra were used to form the measurement space
in which the decision regions were constructed.

Recordings of 225 utterances by each of three speakers, 25 utterances
by each of 26 impostors, and 10 utterances by each of 30 impostors were
used. These utterances consisted of the sentence "My code is -", digitized
to 10-bit accuracy at 8000 samples per second. Only the word "my" was
actually analyzed.

A short-term spectrum was calculated from 256 samples of the
waveform by a fast Fourier transform. Hence, each spectrum consisted of 128
unique frequency components. The questions of 1) how many spectra to use
and 2) how coarse each spectrum should be were investigated by forming
three different 256-dimensional measurement spaces. The first measurement
space was made up of 4 spectra with each spectrum having 64 frequency
components. The second space consisted of 8 spectra with each spectrum
having 32 components. The third space consisted of 16 spectra with each
spectrum having 16 components. The number of frequency components per
spectrum was reduced from 128 to 64, 32, and 16 by simple averaging.
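
The spectral reduction just described can be sketched as follows. This is a
minimal illustration only: it uses a naive discrete Fourier transform in place
of the fast Fourier transform mentioned above, it averages magnitudes (whether
the report averaged magnitudes or powers is not stated), and the function names
and the synthetic 1 kHz test tone are assumptions made for the example.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Return the N/2 unique magnitude components of a real-valued frame.

    A naive O(N^2) DFT is used here for self-containment; an FFT gives the
    same values.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def average_bins(spectrum, n_out):
    """Reduce a spectrum to n_out components by averaging adjacent bins."""
    group = len(spectrum) // n_out
    return [sum(spectrum[i * group:(i + 1) * group]) / group for i in range(n_out)]


# Hypothetical test frame: 256 samples of a 1 kHz tone sampled at 8000 Hz.
frame = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(256)]
full = magnitude_spectrum(frame)   # 128 unique components
coarse = average_bins(full, 32)    # 32 components, as in the second space

# Eight such 32-component spectra would give a 256-dimensional measurement vector.
dimension = 8 * len(coarse)
```

With 256 samples at 8000 samples per second each of the 128 components spans
31.25 Hz, so the 1 kHz tone falls in component 32 of the full spectrum and in
component 8 of the 32-component version.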

The amplitude of each spectrum was normalized to make the energy content
of the word constant. The length of the word "my" varied from 900 time
samples (approximately 110 msec) to 3350 time samples (approximately 420 msec)
according to the particular speaker and the particular utterance involved.
Typical variation of the length of "my" by the same speaker was from 1500
to 2200 time samples. This variation was normalized by placing the spectra
uniformly across the word "my". Therefore, in the case of the typical
speaker eight spectra would approximately cover the word with little if
any overlap.
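
One way to place the spectra uniformly across a word of variable length is
sketched below. The report does not give the exact placement rule, so the
convention used here (first frame at the start of the word, last frame ending
at the end of the word, equal spacing in between) and the function name
`frame_starts` are assumptions.

```python
def frame_starts(word_len, n_frames, frame_len=256):
    """Start indices of n_frames analysis frames spread evenly over a word.

    Assumes word_len >= frame_len. The first frame starts at sample 0 and the
    last frame ends at the last sample of the word.
    """
    last_start = word_len - frame_len
    if n_frames == 1:
        return [last_start // 2]
    return [round(i * last_start / (n_frames - 1)) for i in range(n_frames)]


# A 2048-sample word is covered exactly by eight 256-sample frames.
starts_exact = frame_starts(2048, 8)

# A shorter word (1500 samples) forces the eight frames to overlap.
starts_short = frame_starts(1500, 8)
```

For words near the upper end of the typical 1500 to 2200 sample range the
eight frames tile the word with little or no overlap, in agreement with the
remark above; for shorter words the frames overlap, and for much longer words
gaps appear between them.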

In the method of Distribution-Free Tolerance Regions (DFTR) the
sample space is separated into statistically equivalent blocks by means
of a set of ordering functions. The general procedure is described in
chapter 2 of progress report No. 42, and a more detailed description of
the ordering functions used in the speaker-verification experiment is
given in chapter 3. The union of a certain number of these blocks forms
the acceptance region $R_A$; if a new sample falls into this region it is
classified as being a member of class A (here taken to be the class of
main-speaker samples). If the number of impostor training samples is
$n_B$ and if m is the number of statistically equivalent blocks combined to
form the acceptance region $R_A$, then

    $m = \alpha (n_B + 1)$

where $\alpha$ is the expected value of false alarm probability that is to be
achieved. By choosing the ordering functions to be hyperspheres expanding
from each one of the $n_A$ main-speaker training samples one is assured that
at least all of the training samples lie in the region $R_A$. Hopefully
this procedure will therefore do well on main-speaker test samples as well.

The ordering functions that are combined to form the acceptance
region $R_A$ are formed by ordering the impostor training samples. By
ordering the main-speaker samples as well, an estimate of the probability
of correct classification can be obtained. The expected value of the
probability of correct classification, $P_c$, obeys the inequality

    $E\{P_c\} \ge \frac{b}{n_A + 1}$

where b is the number of complete blocks in region $R_A$ that can be formed
by ordering the $n_A$ main-speaker samples.

To test the system, 40 of the 225 available main-speaker samples and
208 of the 6500 impostor samples were used as training samples. The
acceptance region was composed of seven blocks, giving an expected false
alarm probability of 7/209 or .0335. A summary of the major results is
given in Table 4.4 of progress report No. 42.
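
The arithmetic behind the expected false alarm probability can be checked
with a few lines. This is only a worked example of the relation between the
number of blocks and the number of ordered samples; the function names are
chosen for illustration.

```python
from fractions import Fraction


def expected_false_alarm(m_blocks, n_ordered):
    """Expected false alarm probability for m blocks out of n ordered samples."""
    return Fraction(m_blocks, n_ordered + 1)


def blocks_for(alpha, n_ordered):
    """Largest whole number of blocks whose expected false alarm is at most alpha."""
    return int(alpha * (n_ordered + 1))


# Seven blocks formed from 208 ordered impostor training samples.
alpha = expected_false_alarm(7, 208)
alpha_decimal = round(float(alpha), 4)
```

Here `alpha` is the exact fraction 7/209 and `alpha_decimal` is 0.0335, the
value quoted above.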

The  measured  false  alarm  rate  for  several  different  ways  of  forming 
the  acceptance  regions  is  generally  within  one  standard  deviation  of  the 
expected  value  of  .0335.  Also  the  probability  of  miss  has  roughly  the 
same  order  of  magnitude  and  turned  out  to  be  slightly  better  for  the  sample 
space  made  up  of  eight  short-time  spectra,  32  frequency  components  per 
spectrum,  than  for  the  other  sample  spaces. 

The procedure was compared with the conceptually much simpler
nearest-neighbor method, where a test point is classified according to the
class of the nearest training point. The nearest-neighbor method involves
much more computation time than the DFTR method, and the sample space was
therefore arbitrarily reduced to 48 dimensions. It was found that the
reduction in dimensionality increased the miss rate by almost a factor of
10 for the same expected false alarm rate in the DFTR method. The
nearest-neighbor method shows a much smaller error rate than the DFTR
method; this is to be expected since it utilizes information about all the
samples, while the DFTR method only uses information about samples that
have been ordered.

On the other hand the nearest-neighbor method cannot be set up to guarantee
a specified expected false alarm rate, and it also takes much more
computation time. Thus if computation time is a factor the DFTR method is
definitely superior. The DFTR method is easy to program, and once the
system is trained checking out a new test sample only takes a few seconds
of IBM 7094 time. The system therefore appears to have practical usefulness.
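
For reference, the nearest-neighbor rule used in the comparison can be written
in a few lines. This is a generic sketch using Euclidean distance; the distance
measure actually used in the experiment is not stated here, and the toy
two-dimensional data and labels are hypothetical.

```python
import math


def nearest_neighbor_label(test_point, training_points, training_labels):
    """Classify test_point with the label of its nearest training point."""
    best = min(
        range(len(training_points)),
        key=lambda i: math.dist(test_point, training_points[i]),
    )
    return training_labels[best]


# Hypothetical two-dimensional training set.
training_points = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (2.8, 3.1)]
training_labels = ["main", "main", "impostor", "impostor"]

label_a = nearest_neighbor_label((0.1, 0.3), training_points, training_labels)
label_b = nearest_neighbor_label((2.5, 2.9), training_points, training_labels)
```

Every classification requires a distance to every training point, which is the
source of the computation time noted above, and nothing in the rule fixes the
false alarm rate in advance.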

III. Passive Detection and Tracking Using Surface Scattered Signals

The  major  effect  of  surface  scattering  considered  in  Progress  report 
No.  43  is  the  decorrelation  produced  in  the  signals  received  by  pairs  of 
hydrophones.  A  system  consisting  of  two  hydrophones  is  therefore  postulated, 
and  it  is  assumed  that  the  hydrophone  signals  are  processed  by  a  simple 
cross-correlator as shown in either Fig. 3.1-1 or 3.4-1 of Progress report
No.  43. 

Three different propagation models are considered. The first
considers a direct channel to one of the hydrophones and a surface
scattering channel, modelled as a random amplitude and delay model, to the
second hydrophone. The transfer function for the direct channel is

    $H_D(\omega) = e^{-j\omega R/c}$

where R is the line-of-sight distance from target to receiver and c is the
speed of sound. The transfer function for the scattering channel is time
varying and has the form

    $H_S(\omega, t) = A(t)\, e^{-j\omega\tau(t)}$

where the amplitude function A(t) and the delay function $\tau(t)$ are considered
to be independent stationary Gaussian random processes whose variation is
slow relative to the signal bandwidth.

In the second model both channels between the transmitter and the
two hydrophones are assumed to be random amplitude and delay models with

    $H_1(\omega, t) = A_1(t)\, e^{-j\omega\tau_1(t)}, \qquad H_2(\omega, t) = A_2(t)\, e^{-j\omega\tau_2(t)}$

The amplitude function $A_1(t)$ and the delay function $\tau_1(t)$ are assumed to
be independent stationary Gaussian random processes, as are $A_2(t)$
and $\tau_2(t)$; however $A_1(t)$ and $A_2(t)$ are jointly dependent, as are
$\tau_1(t)$ and $\tau_2(t)$.

In the third model one direct channel and one surface scattering
channel are again assumed, but now the surface is itself modelled as a
random sine wave of the form

    $\zeta(x, y, t) = h(t) \sin[\,q_s x \cos\alpha_s + q_s y \sin\alpha_s - \Omega_s t - \chi(t)\,]$

where h(t) and $\chi(t)$ are random waveheight and positional phase parameters
that are supposed to be very slowly varying. The parameters $q_s$, $\alpha_s$,
and $\Omega_s$ are the magnitude and orientation of the propagation vector and
the temporal frequency of the surface, respectively.

In all cases the signal x(t) at the transmitter is assumed to be a
stationary zero-mean Gaussian random process having a power spectral
density $S_{xx}(\omega)$. Noise signals $n_1(t)$ and $n_2(t)$ are assumed to add to the
signals entering the hydrophones; these are wide-sense stationary Gaussian
random processes that are independent of the signal x(t), but not
necessarily independent of each other.

It is to be noted that in all cases a Gaussian signal is operated
upon by a random channel function; it is therefore not surprising that the
signal received by the hydrophones is no longer Gaussian. As a formal
demonstration of this fact the fourth-order cumulant of the received signal
has been computed and turns out to be nonzero under several different input
conditions, and for all three of the assumed scattering models. (See Fig.
4.4-1).
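
A small numerical experiment illustrates why a Gaussian signal passed through
a random-amplitude channel is no longer Gaussian. The sketch below is not the
computation of the report: it looks only at the single-time fourth cumulant,
draws the amplitude independently for each sample rather than as a slowly
varying process, and uses assumed parameter values (unit-variance signal,
amplitude with mean 1 and standard deviation 0.5).

```python
import random


def fourth_cumulant(samples):
    """Sample estimate of the fourth cumulant, m4 - 3*m2**2, about the mean."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 - 3.0 * m2 * m2


rng = random.Random(0)
n_samples = 100_000

# Gaussian source signal and an independent Gaussian channel amplitude.
x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
a = [rng.gauss(1.0, 0.5) for _ in range(n_samples)]
y = [ai * xi for ai, xi in zip(a, x)]

k4_source = fourth_cumulant(x)    # close to zero for a Gaussian signal
k4_received = fourth_cumulant(y)  # clearly positive after the random channel
```

For independent amplitude and signal the fourth cumulant of the product is
3 Var(A^2) times the squared signal variance, which is 3.375 for the values
assumed here, so the estimate for the received signal stays well away from
zero while that for the source does not.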

The correlator output signal is denoted by $\Sigma(\tau, T, \rho)$, where $\tau$ is the
delay introduced in one of the channels to "steer" the two hydrophones in
the direction of the target, T is the averaging time, and $\rho$ is the time at
which averaging starts. In the absence of scattering and of noise $\Sigma$ would
show a sharp peak at $\tau = \tau_0$, where $\tau_0$ is the "correct" steering delay. As
a result of scattering and/or noise the location of the peak becomes a
random variable depending on the instantaneous scattering conditions during
the averaging period T, and the height and sharpness of the peak are also
reduced so that under severe scattering no clear peak is discernible. (See
Fig. 4.2-4).
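
The behaviour of the correlator output in the simplest case (a pure delay
between the two hydrophones, additive noise, no scattering) can be sketched in
discrete time. All numerical values below (delay of 5 samples, noise level,
averaging length) are assumptions for the example, and the function
`correlator_output` is a discrete stand-in for the time-averaged product
described above, not the processor of Progress report No. 43.

```python
import random


def correlator_output(y1, y2, lag, start, n_avg):
    """Average of y1[t] * y2[t + lag] over n_avg samples beginning at start."""
    return sum(y1[t] * y2[t + lag] for t in range(start, start + n_avg)) / n_avg


rng = random.Random(1)
true_delay = 5      # delay, in samples, of the second hydrophone signal
n_total = 2200

source = [rng.gauss(0.0, 1.0) for _ in range(n_total)]

# First hydrophone: source plus noise. Second: delayed source plus noise.
y1 = [s + 0.5 * rng.gauss(0.0, 1.0) for s in source]
y2 = [
    (source[t - true_delay] if t >= true_delay else 0.0) + 0.5 * rng.gauss(0.0, 1.0)
    for t in range(n_total)
]

# Scan the steering delay and locate the peak of the averaged product.
outputs = {
    lag: correlator_output(y1, y2, lag, start=50, n_avg=2000)
    for lag in range(-20, 21)
}
peak_lag = max(outputs, key=outputs.get)
```

With a fixed delay the peak sits at the correct steering delay; a randomly
varying amplitude or delay in one channel would lower and smear this peak in
the manner described above.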

Several criteria may be employed to evaluate the performance of the
correlator. One of these is to form the likelihood ratio of the correlator
output for $\tau = \tau_0$ and to assume that a signal is present (hypothesis $H_1$ is
true) if the likelihood ratio exceeds some threshold; otherwise hypothesis
$H_0$ - noise only - is assumed to be correct. For reasonable integration
times $\Sigma$ is approximately Gaussian with zero mean and variance $\sigma_0^2$ if $H_0$ is
true, and with a mean $\mu_1$ and variance $\sigma_1^2$ if $H_1$ is true. It is then a
straightforward matter to compute the false alarm and miss probabilities,
and this is done in Eqs. (3.8-10) and (3.8-11). The two error probabilities
are seen to depend only on the two normalised standard deviations of $\Sigma$
defined by $d_0 = \sigma_0/\mu_1$ and $d_1 = \sigma_1/\mu_1$. Unfortunately the dependence is
rather complex and not easily visualised. Furthermore this definition of
error probabilities is somewhat misleading since a small shift of the peak
could result in a very marked reduction of the magnitude of $\Sigma$ at $\tau = \tau_0$.
This would result in rejection of the hypothesis $H_1$ even though the peak
might still be quite clearly discernible.
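
The dependence of the error probabilities on the two normalised standard
deviations can be illustrated with a simplified decision rule. The sketch
assumes a plain threshold on the correlator output rather than the
likelihood-ratio test of Eqs. (3.8-10) and (3.8-11), which are not reproduced
here; the function names and the numerical values are illustrative only.

```python
import math


def q_function(z):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def error_probabilities(threshold, mu1, sigma0, sigma1):
    """False alarm and miss probabilities for a threshold test.

    Noise only:      output ~ Gaussian(0, sigma0**2)
    Signal present:  output ~ Gaussian(mu1, sigma1**2)
    """
    p_false_alarm = q_function(threshold / sigma0)
    p_miss = 1.0 - q_function((threshold - mu1) / sigma1)
    return p_false_alarm, p_miss


# Threshold halfway to the signal-present mean, both normalised deviations 0.25.
pfa, pmiss = error_probabilities(threshold=0.5, mu1=1.0, sigma0=0.25, sigma1=0.25)

# Scaling every quantity by the same factor leaves both probabilities unchanged.
pfa_scaled, pmiss_scaled = error_probabilities(5.0, 10.0, 2.5, 2.5)
```

Because only the ratios of the threshold and the standard deviations to the
signal-present mean enter, the result depends on the standard deviations only
through their normalised values, which is the property stated above.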

Another criterion for evaluating the system may be obtained by
considering the location of the peak of $\Sigma$ to be an estimate of the true value
of $\tau$. Then the variance of this location would be an indication of the
accuracy of the estimate. If it is assumed that there is only a single
peak, then this criterion is equivalent to computing the variance of the
zero crossing of the derivative $\partial\Sigma/\partial\tau$; specifically, the quantity of
interest is this variance normalized with respect to the mean slope of
$\partial\Sigma/\partial\tau$ at $\tau = \tau_0$:

    $\sigma_{\tau_0}^2 = \left. \frac{\mathrm{Var}\left[\partial\Sigma/\partial\tau\right]}{\left(E\left[\partial^2\Sigma/\partial\tau^2\right]\right)^2} \right|_{\tau = \tau_0}$

Actually, it is just as easy to compute the normalized autocorrelation
function of $\Sigma$, whose value for zero argument is then $\sigma_{\tau_0}^2$. A general
expression for this function, denoted by $R_{\tau_0}(\nu)$, is given in Eq. (4.2-13);
unfortunately it is rather complex.

As a third criterion the normalized variance $d_1^2$ may itself be used,
since it contains essentially the same information as $\sigma_{\tau_0}^2$. A general
expression for $d_1^2$ is given in Eq. (4.2-8), and is seen to be quite similar
to the expression for $R_{\tau_0}(\nu)$.

In performing the computation for these expressions it is assumed that
signal and noise spectra, filter transfer functions, and the spectra of
amplitude and delay all have a Gaussian shape; thus the signal spectrum is
given by

    $S_{xx}(\omega) \propto \exp\left[-\frac{\omega^2}{2\Omega_x^2}\right]$

The autocorrelation function of the random delay is

    $R_{\tau\tau}(u) = \sigma_\tau^2 \exp\left[-\tfrac{1}{2}\,\Omega_\tau^2 u^2\right]$

etc. (See Eqs. 4.1-17, 4.1-18, 4.2-2, and 4.2-3).

Curves for $d_1$ as a function of T are shown in Figs. 4.2-1, 4.2-2,
4.2-3, and 4.2-5. These curves are for the first of three propagation
models, but qualitatively the curves for the other two models are similar.
The most striking feature of all of these curves is a well-defined plateau
over which $d_1$ decreases only very little as T increases. For values of T
smaller or larger than the plateau values $d_1$ is proportional
to $1/\sqrt{T}$, as might be expected from rather general statistical
considerations.

The existence of the plateau is another indication of the fact that
the signals received by the hydrophones are non-Gaussian. Qualitatively the
plateau is a result of the fluctuations in the instantaneous estimate of $\tau$
that result from the random delays produced by the surface scattering.
For small integration times an increase in T tends to eliminate fluctuations
due to noise and the randomness of the target signal and to produce a better
definition of the peak in the output. Hence $d_1$ decreases. However, $d_1$ also
measures the fluctuations of the peak that result from scattering, and when
T has become large enough to eliminate noise and signal effects from the
peak, the fluctuations due to scattering still persist. Hence $d_1$ remains
essentially constant until the integration time has become so large that the
scattering fluctuations are also being "washed out".

For the first propagation model (single random amplitude and delay
channel) the level of the plateau in $d_1^2$ is

    [plateau-level expression not legible in the source]

where $\sigma_A^2$ is the variance of the amplitude fluctuation, $A_0$ is the mean
amplitude of the channel, $\sigma_\tau^2$ is the variance of the delay fluctuation, and

    $\sigma_m^2 = \frac{1}{\Omega_m^2} = \sigma_\tau^2 + \frac{1}{\Omega_x^2} + \frac{1}{2\Omega_1^2} + \frac{1}{2\Omega_2^2}$

Here $\Omega_x$ is the bandwidth of the signal and $\Omega_1$ and $\Omega_2$ are the bandwidths of
the two filters $H_1(\omega)$ and $H_2(\omega)$ used in the correlator. Note that the
plateau will exist even if there is no random delay: this is because $d_1$
measures the total variation of the output peak, not only its motion along
the $\tau$ axis. The plateau does disappear if both $\sigma_A$ and $\sigma_\tau$ are zero, as would
be expected. It also becomes less pronounced as the signal-to-noise ratio
decreases, since the received signal then consists mostly of noise and tends
to be Gaussian.

A similar plateau is found in the expression for $R_{\tau_0}(0)$, which is a
better indicator of the tracking error than $d_1$. The plateau here is

    [plateau-level expression not legible in the source]

Note that this plateau disappears when $\sigma_\tau = 0$, since amplitude fluctuations
do not affect the tracking error in that case.

Similar plateaus are found in the other propagation models. Expressions
corresponding to the two given here are Eqs. 4.4-5 and 4.4-8 respectively
for the two-scattering channel model and Eqs. 5.3-12 and 5.3-15 for the
random sine-wave model.

In the random sine-wave model the height of the plateau can be related
to an effective Rayleigh parameter

    $\frac{\sqrt{2}\,\sigma_h \sin\psi}{c}\,\Omega_m$

where $\sigma_h^2$ is the variance of the amplitude, $\psi$ is the grazing angle, c is
the sound velocity in water, and $\Omega_m^{-2} = 1/\Omega_x^2 + 1/\Omega_f^2$, where $\Omega_f$ is the
filter bandwidth (assumed to be identical in the two correlator channels)
and $\Omega_x$ is the signal bandwidth, suitably defined (see Eq. 5.3-1). The
critical value of this parameter for both $d_1$ and $R_{\tau_0}(0)$ is roughly unity:
the plateau is small for lesser values, and rises steeply for larger values.
It is interesting to note the appearance of the Rayleigh parameter in this
context since it is generally a good measure of relative surface roughness.

The height of the plateau can be reduced and the performance of the
system improved by reducing the filter bandwidth $\Omega_f$. It is clear from
the expression for $\sigma_m$ (where $\Omega_1 = \Omega_2 = \Omega_f$) that this makes $\sigma_m \gg \sigma_\tau$ and
therefore reduces the second term in the expression for the plateau level.
Physically the effect of reducing the bandwidth of the filter is to screen
out some of the fluctuation, and it seems reasonable that this should
improve the performance if not carried too far. It could also be expected
that extreme reduction of filter bandwidth would result again in a
worsening of performance, and this is clearly shown in Figs. 4.2-8 and 4.2-9.

Additional results contained in progress report No. 43 deal with other
aspects of the scattering transfer functions for the three models.
Expressions have been obtained for the interfrequency correlation function,
frequency spreading function, and other moments that will be useful in
signal design and receiver optimization. These expressions are all rather
complex, and details must be obtained from computer calculations. In
general, however, the results for all three models in regard to these
functions are qualitatively similar.
ABSTRACT


The  purpose  of  this  work  is  to  investigate  a  nonparametric 
classification  procedure  based  on  distribution-free  tolerance  regions. 
The  procedure  is  one  which  gives  some  knowledge  about  how  well 
the  classifier  is  expected  to  perform.  This  is  achieved  by  using 
only one sample of statistically independent observations from each
class.


The  approach,  which  is  called  the  hypersphere  DFTR 
approach,  is  formulated  in  a  two  class  problem.  The  proposed 
recognition  system  is  one  which  can  be  designed  for  a  given  expected 
false  alarm  probability  or  for  a  given  confidence  that  the  false  alarm 
probability is less than a given amount. A few procedures are
presented which have certain desirable properties and which appear to
do  a  good  job  of  minimizing  the  miss  probability. 

Three  principal  DFTR  procedures  are  presented.  The  small 
and  large  sample  properties  of  these  procedures  are  investigated  and 
the  procedures  are  compared.  A  procedure  for  obtaining  a  measure 
of  the  miss  probability  is  also  presented. 

The procedures are tested in an automatic speaker verification
experiment. A comparison is made of the test false alarm rate
with the 95% upper tolerance limit on the false alarm probability and
also with the expected false alarm probability. In the experiment
all test false alarm rates fell below the 95% upper tolerance limit.
The average test false alarm rate for the 21 different cases studied here
was approximately equal to 0.8 of the average expected false alarm
probability.
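
An upper tolerance limit of this kind can be computed from the standard
result that the probability content of a union of m statistically equivalent
blocks formed from n samples follows a Beta(m, n - m + 1) distribution. The
sketch below applies that result with m = 7 and n = 208 purely as an
illustration; those values, the function names, and the bisection search are
assumptions, and the limits reported in the experiment may have been obtained
differently.

```python
import math


def coverage_cdf(p, m_blocks, n_samples):
    """P(coverage <= p) for coverage ~ Beta(m, n - m + 1).

    Uses the identity between the regularized incomplete beta function and
    the binomial tail: I_p(m, n - m + 1) = P(Binomial(n, p) >= m).
    """
    below = sum(
        math.comb(n_samples, k) * p**k * (1.0 - p) ** (n_samples - k)
        for k in range(m_blocks)
    )
    return 1.0 - below


def upper_tolerance_limit(m_blocks, n_samples, confidence=0.95):
    """Smallest p such that the coverage is below p with the given confidence."""
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if coverage_cdf(mid, m_blocks, n_samples) < confidence:
            lo = mid
        else:
            hi = mid
    return hi


expected = 7 / 209
limit_95 = upper_tolerance_limit(7, 208)
```

The 95% limit lies noticeably above the expected value m/(n + 1), which is why
an observed false alarm rate can exceed the expected probability and still
fall below the tolerance limit.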

A  comparison  is  made  of  the  test  miss  rate  with  a  measure 
of  the  miss  probability  that  was  obtained  by  using  a  tolerance  region 
approach.  In  the  speaker  verification  experiment  all  test  miss  rates 
fell  below  the  95%  upper  tolerance  limit.  For  the  21  different  cases 
studied,  the  average  test  miss  rate  was  equal  to  0.82  of  the  average 
expected  miss  rate. 

Finally, the probability of error for the hypersphere DFTR
procedure is theoretically compared with the probability of error
for  the  nearest-neighbor  rule  without  assuming  the  form  of  the  class 
probability  distributions. 
Chapter 1


INTRODUCTION 

The  ability  to  recognize  and  to  respond  to  visual,  auditory,  or 
other  patterns  can  be  regarded  as  a  prerequisite  for  any  intelligent 
behavior,  and  it  is,  in  fact,  possessed  by  all  living  things  to  some  degree. 
In  general  it  can  probably  be  said  that  the  more  intelligent  an  animal  is 
the  greater  is  the  repertoire  of  patterns  that  it  can  recognize.  Certainly 
the  number  of  patterns  that  can  be  recognized  by  human  beings,  ranging 
over  auditory  patterns  such  as  speech  sound,  music,  sounds  of  nature, 
etc.  to  visual  patterns  such  as  those  made  by  objects,  faces,  letters, 
etc.  is  so  vast  as  to  defy  enumeration. 

Since the early days of computers, attempts have been made to
enable these supposedly intelligent machines to recognize patterns also.
To some extent these attempts have been quite successful. Every
computer possesses the ability to recognize the symbols of its
machine-language alphabet, and the developments in computer software over the
last dozen years have shown that computers can be made to recognize
rather intricate input patterns that seem quite far removed from the
basic machine language.

A major difference between the ability of computers and of living
beings to recognize patterns appears, however, to be that the latter can
recognize patterns that they have never observed before while the former
can generally not do this. Thus a person has no difficulty in recognizing
an object to be, say, a glass, even though the precise shape or color
may be quite different from any that he has seen before. Apparently the
human pattern recognizer is able to react to general features that categorize
the pattern without being put off by details that are somehow understood
to be irrelevant. This ability is one that, so far, machines possess
only very imperfectly.

There are many machine pattern recognition tasks which up to
now have not been solved satisfactorily. Some of these are recognition
of  a  person  from  his  handwriting,  or  from  his  voice,  or  from  his  picture; 
recognition  of  spoken  messages  regardless  of  the  speaker;  and  recognition 
of  complex  structural  images  from  pictures. 

Automatic  pattern  recognition  has  been  attempted  in  many  fields. 
For  instance,  in  medicine  pattern  recognition  is  generally  used  by  the 
doctor  for  diagnosis.  Machine  recognition  is  being  investigated  for  such 
seemingly applicable tasks as the analysis of electrocardiograms,
electroencephalograms and blood cell photos. Other examples of areas in which
machine  pattern  recognition  is  being  applied  are  physics,  geology,  and 
meteorology.  In  physics,  automatic  pattern  recognition  is  being  used 
for  particle  tracking  in  bubble  chambers.  Recognition  of  the  location  of 
oil  deposits  through  seismic  and  magnetic  signal  analysis  is  being 
attempted by geologists. Meteorologists are investigating weather
prediction through the analysis of cloud photographs.


Model 


An  often-used  model  for  a  pattern  recognizer  was  proposed  by 
Marill and Green (1960). This model is shown in Figure 1. It consists
of  two  important  parts,  the  receptor  and  the  categorizer  (or  classifier). 

The receptor transforms the input data, which might be the
motion of a transducer or the output of an optical scanner, into a
measurement space of high dimensionality in which observations from the same
class cluster and observations from unlike classes separate. The
transformation may be linear or nonlinear, information preserving or
destroying. The categorizer determines the decision regions in the
measurement  space  and  tests  the  proximity  of  an  unclassified  observation 
to  these  regions. 

Decision  Theory 

Fundamental  to  the  design  of  the  categorizer  is  statistical  decision 
theory,  c.f.  Wald  (1950),  Blackwell  and  Girshick  (1954),  Anderson  (1958). 
We  briefly  review  some  of  this  theory  that  is  applicable  to  our  problem. 


Suppose an observation is to be classified into one of several
classes. The observation is represented by a measurement vector $v$
in the measurement space. The classification procedure can be described
as a mapping of the measurement space into the classes $i = 1, \ldots, K$.
Let $R_i$ be the region of the measurement space which is mapped into
class $i$, $i = 1, \ldots, K$. If a new observation falls in $R_i$, it is classified
into class $i$. Let $\xi_i$ be the a priori probability that the observation
belongs to class $i$, let $f_i(v)$ be the probability density function of the
observation $v$, assuming that it is a member of class $i$, and let $C_i(j)$
be the cost of deciding that $v$ is a member of class $j$ when it is a
member of class $i$.


Figure 1. Model of a Pattern Recognizer


The expected risk or loss in making decisions is

$$\sum_{i=1}^{K} \sum_{j=1}^{K} \xi_i \, C_i(j) \int_{R_j} f_i(v)\, dv. \tag{1.1}$$

According to the Bayes criterion the expected loss is minimized by
deciding that $v$ belongs to class $k$ when

$$\sum_{i=1}^{K} \xi_i \, f_i(v)\, C_i(k) \le \sum_{i=1}^{K} \xi_i \, f_i(v)\, C_i(j) \tag{1.2}$$

for all $j$.

Suppose the cost of making a correct decision is zero and the
cost of making an incorrect decision is equal to $C$. Then $C_i(j) = C(1 - \delta_{ij})$,
where $\delta_{ij}$ is the Kronecker delta.
By subtracting $C \sum_{i=1,\, i \ne j,k}^{K} \xi_i f_i(v)$ from both sides of equation 1.2 and by
dividing through by $C$ we obtain the decision rule for deciding in favor
of class $k$. The decision is made that $v$ belongs to class $k$ when

$$\xi_j \, f_j(v) \le \xi_k \, f_k(v) \tag{1.3}$$

for all $j$.


It is sometimes convenient to formulate the decision rule in terms
of the likelihood ratio $f_k(v)/f_j(v)$.
Rewriting equation 1.3 in terms of the likelihood ratio, one decides in
favor of class $k$ if

$$\frac{f_k(v)}{f_j(v)} \ge \frac{\xi_j}{\xi_k} \tag{1.4}$$

for all $j$.

For a two-class problem, one simply compares the likelihood
ratio with a constant. Many criteria yield decision rules involving
likelihood ratio comparisons. Some of these are the Neyman-Pearson
criterion, the Ideal Observer criterion, and the Minimax criterion, cf.
Van Trees (1968).
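
As an illustration of rules (1.2) and (1.3), the following minimal Python sketch picks the class with the smallest expected loss. The Gaussian class densities, the priors, and the names `gaussian_pdf` and `bayes_decide` are assumptions made only for this example; they do not come from the text.

```python
import math


def gaussian_pdf(v, mean, std):
    """Univariate normal density (an assumed form for f_i, for illustration only)."""
    z = (v - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bayes_decide(v, priors, densities, cost):
    """Return the class k (1-based) minimizing sum_i xi_i f_i(v) C_i(k), as in (1.2)."""
    num_classes = len(priors)
    weighted = [priors[i] * densities[i](v) for i in range(num_classes)]
    risks = [
        sum(weighted[i] * cost[i][k] for i in range(num_classes))
        for k in range(num_classes)
    ]
    return 1 + min(range(num_classes), key=lambda k: risks[k])


# Hypothetical two-class setup: zero cost when correct, cost C = 1 otherwise,
# so the rule reduces to comparing xi_k f_k(v) as in (1.3).
densities = [lambda v: gaussian_pdf(v, 0.0, 1.0), lambda v: gaussian_pdf(v, 2.0, 1.0)]
zero_one_cost = [[0.0, 1.0], [1.0, 0.0]]

print(bayes_decide(0.5, [0.5, 0.5], densities, zero_one_cost))
print(bayes_decide(1.5, [0.5, 0.5], densities, zero_one_cost))
print(bayes_decide(1.5, [0.9, 0.1], densities, zero_one_cost))
```

With equal priors the boundary lies midway between the two assumed means; raising the prior of class 1 moves the boundary toward class 2, which is the behavior (1.3) predicts.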

In  the  following  chapters  the  two-class  problem  is  discussed  at 
length.  For  convenience,  the  two  conditional  probabilities  of  error  will 
be  defined  as  follows.  The  conditional  probability  of  deciding  in  favor 
of  class  2  when  the  observation  belongs  to  class  1  (false  acceptance  of 
class 2) is called the false alarm probability. The mathematical notation
for this conditional probability is


(1.5)


The  conditional  probability  of  deciding  in  favor  of  class  1  when  the 
observation  belongs  to  class  2  is  called  the  miss  probability.  It  is 
denoted  by 


(1.6) 


The use of the terms "false alarm" and "miss" implies that classes 2
and 1 are associated respectively with the occurrence or nonoccurrence
of some event (such as the presence of a target on a radar screen).
These terms are more appropriate for this thesis than the less descriptive
"error of the first kind" and "error of the second kind" because the major
problem dealt with here involves a main class and an impostor class, cf.
Section 4.1.

Classification  Methods 

In general, the conditional joint probability densities, $f_i(v)$,
are not known in pattern recognition problems. Usually the only
information available for designing the pattern recognizer is a limited set of
properly classified data. In this case an optimum solution in the Bayes
sense is not applicable.

Let us briefly define some of the terminology which distinguishes
the classification procedures. A machine is said to "learn" when it is
able to improve its performance by benefiting from its past experience.
The period of constructing decision regions is called the training period.
This  is  distinguished  from  the  recognition  or  test  period  in  which 
observations  of  unknown  classification  are  classified.  Pattern  recognition 
may  be  accomplished  by  supervised  or  unsupervised  learning.  In  the 
former,  the  training  observations  are  of  known  classification  and  in  the 
latter,  which  is  sometimes  called  learning  without  a  teacher,  the  training 
observations  are  of  unknown  classification. 


Some of the methods which have been proposed for solving the
classification problem in pattern recognition are:


(1)  Optimum  Solution  with  Assumed  Probability  Densities:  Functional 
forms for the conditional densities $f_i(v)$, $i = 1, \ldots, K$, are assumed to
be  known.  Some  of  the  parameters  of  these  densities  are  often  estimated 
from  training  observations.  The  optimum  decision  regions  are  found 
for  these  assumptions.  Then  new  observations  are  classified.  If  the 
results  are  not  satisfactory,  the  assumptions  are  revised  and  new  decision 
regions  are  formed. 

(2) Estimation or Approximation of the Probability Densities: The
class probability densities are estimated or approximated using the
training observations. A new observation is then classified according
to the Bayes rule, where the estimated probability densities are substituted
for the true probability densities.

(3) Estimation or Approximation of the Class Discriminating
Boundaries: A structure for the boundary which partitions the
measurement space into decision regions is assumed. The structures for the
discriminating boundary range from the simple hyperplane to complex
surfaces of the form $\sum_{i=1}^{m} w_i \phi_i$, where the $\phi_i$ are functions of the
measurement space and $m$ is finite. The discriminating boundary is then
trained for the "best" results. "Best" results usually means minimum
probability of error when the class probability densities are assumed or
minimum misclassification of the training observations when the class
probability densities are not assumed.
(4) Other Intuitive Criteria: These include such approaches as
maximization of entropy, minimization of intraclass distance around
characteristic points of the classes, the Fisher Criterion, and the
nearest-neighbor rule.

Appendix  D  gives  a  review  of  these  classification  methods  along 
with  the  associated  references.  It  should  be  noted  that  a  comparison  of 
the above approaches is very difficult since the criterion for a good
pattern recognizer varies from author to author and since the data sets
on which the pattern recognizers are tested are usually different.

Review  of  the  Contents 

It is the intention of the present work to investigate a nonparametric
classification procedure based on distribution-free tolerance regions.
This procedure is one which gives some knowledge about how well the
classifier  is  expected  to  perform.  This  is  achieved  by  using  only  one 
sample  of  statistically  independent  observations  from  each  class.  The 
classification  procedure  is  then  applied  to  a  practical  pattern  recognition 
problem. 

In  Chapter  2  a  brief  review  of  the  theory  of  distribution-free 
tolerance  regions  is  presented.  A  detailed  study  of  this  theory  is  made 
in  Appendix  A.  A  review  of  existing  methods  for  applying  the  theory  of 
distribution-free  tolerance  regions  to  classification  problems  also  appears 
in  Chapter  2. 

The effectiveness of certain methods for constructing distribution-free
tolerance regions for classification purposes is investigated in
Chapter 3. The approach, which is called the hypersphere DFTR approach,
is formulated in a two-class problem. The proposed recognition system
is one which can be designed for a given expected false alarm probability
or for a given confidence that the false alarm probability is less
than a given amount. It is assumed that the only information available
for designing the recognizer is a properly labeled sample of statistically
independent observations from each class. A few procedures are presented
which have certain desirable properties and which appear to do a
good job of minimizing the miss probability. A procedure for obtaining
a measure of the miss probability is also presented. The extension of
the hypersphere DFTR procedure to the multiclass problem is also
discussed.

Chapter 4 reports on an automatic speaker verification system
and its use in testing the hypersphere DFTR classification schemes.

A theoretical comparison of the probability of error for a hypersphere
DFTR procedure with the probability of error for the nearest-neighbor
rule is presented in Chapter 5.

Chapter 6 presents conclusions and lists suggestions for further
study.
Chapter 2


DISTRIBUTION-FREE  TOLERANCE  REGIONS 
AND  CLASSIFICATION 


Introduction 

Existing  classification  methods  which  involve  the  theory  of 
distribution-free  tolerance  regions  are  discussed  in  this  chapter.  The 
chapter  begins  with  a  brief  review  of  the  theory  of  distribution-free 
tolerance  regions.  For  further  details  on  this  subject,  see  Appendix  A. 
Later  in  Chapter  2  classification  procedures  by  Anderson  (1966)  are 
presented.  Next,  the  use  of  statistically  equivalent  blocks  and  the 
empirical  Bayes  approach  by  Patrick  (1966)  is  discussed.  Later,  a 
method  by  Quesenberry  and  Gessaman  (1968),  which  involves  regions 
of  indecision,  is  presented. 


A  Brief  Review  of  Distribution-Free  Tolerance  Regions 

Suppose $n_1$ independent observations, $X_1, X_2, \ldots, X_{n_1}$, are
available from a population with continuous univariate probability
density $f_1(x)$. Let $X_{(1)} < X_{(2)} < \ldots < X_{(n_1)}$ denote the observations
arranged in ascending order of magnitude. It was first shown by Wald
(1941) that the amount of probability in $(X_{(r)}, X_{(n_1-r+1)})$ is
distribution-free. Hence a statement such as
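$$E\left\{ \int_{X_{(r)}}^{X_{(n_1-r+1)}} f_1(x)\, dx \right\} = \alpha, \qquad 0 < \alpha < 1 \tag{2.1a}$$

or

$$\Pr\left\{ \int_{X_{(r)}}^{X_{(n_1-r+1)}} f_1(x)\, dx \ge \beta \right\} = \gamma, \qquad 0 < \beta < 1, \quad 0 < \gamma < 1 \tag{2.1b}$$

can be made even though the density $f_1(x)$ is unknown.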
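
The distribution-free property can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original development: it relies on the standard result that the probability content of the interval between the $r$th and $(n-r+1)$th order statistics has a Beta$(n-2r+1,\, 2r)$ distribution, whose mean is $(n-2r+1)/(n+1)$, and it compares that mean with simulated averages for two arbitrarily chosen populations (uniform and exponential). The function names are hypothetical.

```python
import math
import random


def expected_coverage(n, r):
    """Mean probability content of (X_(r), X_(n-r+1)) for any continuous density."""
    return (n - 2 * r + 1) / (n + 1)


def simulated_coverage(n, r, sampler, cdf, trials=2000, seed=0):
    """Average the true content F(X_(n-r+1)) - F(X_(r)) over repeated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = sorted(sampler(rng) for _ in range(n))
        total += cdf(xs[n - r]) - cdf(xs[r - 1])   # xs[n - r] is X_(n-r+1)
    return total / trials


n, r = 20, 2
uniform = simulated_coverage(n, r, lambda g: g.random(), lambda x: x)
exponential = simulated_coverage(
    n, r, lambda g: g.expovariate(1.0), lambda x: 1.0 - math.exp(-x)
)
print(expected_coverage(n, r), uniform, exponential)
```

Both simulated averages should land close to the same theoretical value even though the two populations differ, which is what "distribution-free" means here.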

The theory of distribution-free tolerance regions was extended
for multivariate distributions by Wald (1943). He formed distribution-free
tolerance regions by successive elimination of sample regions of
the multidimensional space. For example, suppose a statement is to
be made about the amount of probability in the "center" of the
two-dimensional distribution $F_1(x_1, x_2)$. Let the independent observations
$X_i = (X_{i1}, X_{i2})^T$, $i = 1, \ldots, n_1$, be arranged in ascending order of magnitude
of the 1st variate, $x_1$. Denote the ordered variate values by
$X_{(1)1} < X_{(2)1} < \ldots < X_{(n_1)1}$. Let $r$ be an integer which is less than
$n_1/2$. Let the region for which the first variate $x_1$ is less than or
equal to the $r$th smallest first variate of the $n_1$ observations be
"eliminated" from the space. That is, the space is partitioned into two
regions. One region, the region $\{x : x_1 \le X_{(r)1}\}$, will not be considered
in further ordering of the observations, hence it is "eliminated."
Since the stated interest is in the center portion of $F_1(x_1, x_2)$, eliminate
the region for which $x_1 \ge X_{(n_1-r)1}$. Further eliminate the region for
which the second variate is less than or equal to the $s$th smallest second
variate of the remaining observations. Here, $s$ is an integer which is
less than $(n_1-2r)/2$. Also eliminate the region $\{x : x_2 \ge X_{(n_1-2r-2s)2}\}$.
The remaining region

$$\{x : X_{(r)1} < x_1 < X_{(n_1-r)1},\ X_{(s)2} < x_2 < X_{(n_1-2r-2s)2}\}$$

is distribution-free. Figure 2.1 shows such a region for $s = r = 2$.
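
The successive-elimination idea can be sketched in a few lines of Python. This is a minimal illustration under assumptions of its own: the upper cuts are taken at the $r$th largest first variate and at the $s$th largest second variate of the observations that remain, which is one reading of the index convention above; the test population in `draw` is hypothetical; and the check uses the block-counting argument that $2r + 2s$ of the $n + 1$ blocks are eliminated, so the expected content of the remaining region should be $(n + 1 - 2r - 2s)/(n + 1)$.

```python
import random


def wald_region(sample, r, s):
    """Successive elimination in two dimensions; returns (lo1, hi1, lo2, hi2)."""
    by_first = sorted(sample, key=lambda p: p[0])
    lo1 = by_first[r - 1][0]        # r-th smallest first variate
    hi1 = by_first[-r][0]           # r-th largest first variate
    middle = by_first[r:-r]         # observations left after both cuts
    by_second = sorted(middle, key=lambda p: p[1])
    lo2 = by_second[s - 1][1]       # s-th smallest second variate of the remainder
    hi2 = by_second[-s][1]          # s-th largest second variate of the remainder
    return lo1, hi1, lo2, hi2


def draw(rng):
    """Hypothetical correlated, non-normal population used only for the check."""
    x1 = rng.gauss(0.0, 1.0)
    return (x1, 0.5 * x1 + rng.expovariate(1.0))


def mean_coverage(n, r, s, trials=300, fresh=300, seed=1):
    """Average probability content of the remaining region over repeated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        lo1, hi1, lo2, hi2 = wald_region([draw(rng) for _ in range(n)], r, s)
        hits = 0
        for _ in range(fresh):
            x1, x2 = draw(rng)
            if lo1 < x1 < hi1 and lo2 < x2 < hi2:
                hits += 1
        total += hits / fresh
    return total / trials


n, r, s = 19, 2, 2
print(mean_coverage(n, r, s), (n + 1 - 2 * r - 2 * s) / (n + 1))
```

The two printed numbers should agree to within simulation noise, although the population in `draw` was never used in constructing the region.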

Figure 2.1. Wald's Method of Successive Elimination for
Forming Distribution-Free Tolerance Regions.


The procedures for forming distribution-free tolerance regions
have been generalized in papers by Scheffe and Tukey (1945), Tukey
(1947), Tukey (1948), Fraser and Wormleighton (1951), Fraser (1951),
Fraser (1953), and Kemperman (1956). A particularly useful generalization
is the following: Suppose that $n_1$ independent observations are
available from a continuous $D$-dimensional cumulative distribution
function $F_1(x_1, x_2, \ldots, x_D) = F_1(x)$. Let $h_i(x)$, $i = 1, \ldots, n_1$, be $n_1$
functions such that $h_1(X), \ldots, h_{n_1}(X)$ are random variables with a
continuous joint distribution function. The functions $h_i(x)$ are called
ordering functions. They are used to partition the sample space into
$n_1 + 1$ mutually exclusive and exhaustive sample regions called
"statistically equivalent blocks." The regions are hereafter called simply
"blocks." Let $X_{(1)}$ be the observation which yields the smallest value
for the first ordering function, $h_1(x)$. Then the $n_1 + 1$ blocks
$B_1, \ldots, B_{n_1+1}$ can be defined as follows:

$$B_1 = \{x : h_1(x) < h_1(X_{(1)})\}$$

where $X_{(1)}$ is the observation for which

$$h_1(X_{(1)}) = \min_{1 \le i \le n_1} h_1(X_i),$$

$$B_2 = \{x : h_2(x) < h_2(X_{(2)}),\ h_1(x) > h_1(X_{(1)})\} \tag{2.2}$$

where $X_{(2)}$ is the observation, excluding $X_{(1)}$, for which $h_2(x)$
is minimum,

$$h_2(X_{(2)}) = \min_{1 \le i \le n_1,\ i \ne (1)} h_2(X_i),$$

$$B_k = \{x : h_k(x) < h_k(X_{(k)}),\ h_{k-1}(x) > h_{k-1}(X_{(k-1)}),\ \ldots,\ h_1(x) > h_1(X_{(1)})\} \tag{2.3}$$

where $X_{(k)}$ is the observation, excluding $X_{(1)}, \ldots, X_{(k-1)}$, for which
$h_k(x)$ is minimum,

$$h_k(X_{(k)}) = \min_{1 \le i \le n_1,\ i \ne (1), \ldots, (k-1)} h_k(X_i), \tag{2.4}$$

and $B_{n_1+1}$ is the space which remains after the $n_1$ blocks have been
formed,

$$B_{n_1+1} = \mathcal{X} - \bigcup_{i=1}^{n_1} B_i. \tag{2.5}$$


All ordering functions subsequent to $h_1$ may depend on the
blocks previously formed, all known boundary observations, and on
certain sets of indices. For example, let $h_1(x) = |x|$. Let $X_{(1)}$ be
the observation that minimizes $h_1$, $h_1(X_{(1)}) = \min_i h_1(X_i)$. Then
the first block $B_1$ consists of the region inside the hypersphere
$|x| = |X_{(1)}|$. The block $B_1$ along with the observation $X_{(1)}$ is now
eliminated from the sample space. The information found in the 1st ordering
(e.g. the location of $X_{(1)}$, the size of $B_1$, etc.) can be used to form
subsequent blocks. For example, the ordering function $h_2(x) = |x - X_{(1)}|$
can be used to order the second observation and form the second block.
Let $X_{(2)}$ be the observation among the remaining $n_1 - 1$ observations
which yields the smallest value for $h_2(x)$. Then

$$B_2 = \{x : h_2(x) < h_2(X_{(2)}),\ h_1(x) > h_1(X_{(1)})\}.$$

The formation of the first two blocks in this example is illustrated in
Figure 2.2 for a two-dimensional vector $x$.

Note that the first block could have been formed by choosing
the observation which gives the largest value for $h_1(x)$. Of course,
this block would not be the same as $B_1$ of Figure 2.2. Distribution-free
tolerance regions can even be formed by choosing the $r$th smallest
or $r$th largest value for $h_1$, cf. Fraser (1957).
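
The block construction in this example can be sketched as follows. The text specifies only $h_1(x) = |x|$ and $h_2(x) = |x - X_{(1)}|$; continuing the pattern with $h_{k+1}(x) = |x - X_{(k)}|$ is an assumption made here for illustration, and the names `form_blocks` and `block_index` are hypothetical.

```python
import math


def form_blocks(observations):
    """Return [(center, radius), ...] for blocks B_1..B_n; B_{n+1} is what remains."""
    remaining = [tuple(p) for p in observations]
    center = (0.0,) * len(remaining[0])      # h_1(x) = |x|
    blocks = []
    while remaining:
        # The boundary observation X_(k) minimizes h_k over the remaining sample.
        boundary = min(remaining, key=lambda p: math.dist(p, center))
        blocks.append((center, math.dist(boundary, center)))
        remaining.remove(boundary)
        center = boundary                    # assumed: h_{k+1}(x) = |x - X_(k)|
    return blocks


def block_index(x, blocks):
    """Index (1-based) of the block containing x, following (2.3) and (2.5)."""
    for k, (center, radius) in enumerate(blocks, start=1):
        if math.dist(x, center) < radius:
            return k                         # earlier tests failed, so x is in B_k
    return len(blocks) + 1                   # the leftover block B_{n+1}


sample = [(1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
blocks = form_blocks(sample)
print([block_index(p, blocks) for p in [(0.5, 0.0), (2.0, 0.0), (0.0, 4.0), (10.0, 10.0)]])
```

Scanning the blocks in order works because a point belongs to $B_k$ only if it failed the membership test of every earlier block, which is exactly the chain of inequalities in the block definition.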


Classification  Using  Distribution-Free  Tolerance  Regions 

Anderson  (1966)  proposes  various  multivariate  statistical 
techniques  based  on  the  properties  of  statistically  equivalent  blocks. 
He  presents  procedures  for  (1)  testing  the  hypothesis  that  an  unknown 
cumulative  distribution  is  a  specified  one,  (2)  testing  the  hypothesis 
that two unknown distributions are identical, and (3) classifying an
observation into one of two populations. We are interested in the
classification techniques. Let $X_1, \ldots, X_n$ be $n$ independent vector
observations from a population with distribution $F(x)$ and $Y_1, \ldots, Y_m$
be $m$ independent vector observations from a population $G(y)$, where
$F(x)$ is assumed different from $G(y)$. Let $V$ be a new observation
which is drawn from one of the two populations. The observation $V$
is to be classified into one of the populations. Anderson mentions
several nonparametric classification procedures based on ordering the
observations.

In one procedure, the blocks are formed by ranking the pooled
$X$ and $Y$ observations. Consider the block in which $V$ falls. The
observation $V$ is classified according to the majority of observations
defining the block. For example, suppose $V$ falls in a block which
has four sides. If three of these four sides are drawn through $X$
observations, then $V$ is classified as an $X$ observation. In another
procedure the $X$ and $Y$ observations are ordered separately. Consider
the $X$-block and the $Y$-block that $V$ falls in. If there are fewer $Y$
observations in the $X$-block than $X$ observations in the $Y$-block,
$V$ is classified as an $X$ observation. This procedure is similar to
the $k$-nearest neighbor rule which is discussed in Appendix D. Other
similar classification procedures are mentioned. Anderson points out
that some of these classification procedures can be made to depend on
$n$ and $m$ so that as $n$ and $m$ increase the probabilities of misclassification
will converge to the probabilities of a procedure based on the likelihood
ratio.

Patrick (1966) and Patrick and Fisher (1967) present a general
classification approach which they refer to as an empirical Bayes
approach for distribution-free minimum conditional risk learning systems.
This approach involves the construction of distribution-free tolerance
regions for each class. Classification is obtained by comparing the
volumes of the tolerance regions for the different classes. For example,
consider the tolerance regions in which a new observation $V$ falls.
Note that each class has been ordered separately and for every $V$ there
is one tolerance region to be considered for each class. The observation
$V$ is classified into the class whose tolerance region has the
smallest volume, with appropriate compensation being made for the
loss functions and the a priori class probabilities.

The  approaches  of  Anderson  and  of  Patrick  do  not  use  the  blocks 
to  obtain  an  estimate  of  how  well  the  classifier  will  perform.  Since 
the  decision  regions  contain  partial  blocks,  this  information  cannot  be 
obtained  accurately  from  classifiers  of  their  design.  This  fact  becomes 
clearer  as  we  study  Chapter  3. 

A different use for distribution-free tolerance regions in classification
is made by Quesenberry and Gessaman (1968). Emphasis is
placed upon the control of the distribution of the conditional probabilities
of error, i.e. the false alarm probability and the miss probability
in the two-class problem. This approach requires a region in the
measurement space which is commonly called a reject region or a
deferred decision region. If a new observation falls in this region, no
decision is made. The problem with their approach is that the size of
this region depends on the location of the observations from the various
classes and on the ordering functions chosen. No control is exercised
over the size of the reject region. Hence if the distributions are "close"
together or if the ordering functions are unhappily chosen, the probability
of not making a decision can be large.

Quesenberry and Gessaman's procedure involves the construction
of a distribution-free tolerance region $A_j$ containing $a_j$ blocks for
each distribution $F_j$, $j = 1, \ldots, K$. For each set $A_j$ there is a
complement set $\bar{A}_j = \mathcal{X} - A_j$. Let $R_j$ be the region in which the decision
is made that the new observation comes from distribution $F_j$. Let
$R_j$ be given by

$$R_j = \bar{A}_j \cap \bigcap_{i=1,\, i \ne j}^{K} A_i .$$

Let $R_0$ be the region in which no decision is made. Let $R_0$ be given
by

$$R_0 = \left( \bigcap_{i=1}^{K} A_i \right) \cup \left( \bigcup_{i=1}^{K} \bigcup_{j \ne i} \left( \bar{A}_i \cap \bar{A}_j \right) \right) .$$

The probability of deciding any class other than class $j$ when the new
observation is from class $j$ is controlled since there are no more
than $a_j$ blocks in the regions for deciding any other class. These ideas
become more transparent as this approach and the hypersphere DFTR
approach are investigated in Section 3.11.
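
A small sketch may help make the decision and reject regions concrete. It assumes that each class $j$ has a "core" set (the complement of $A_j$) and that an observation is assigned to class $j$ only when it lies in that core and in no other; otherwise no decision is made. The circular cores and the names `decide` and `disc` are hypothetical stand-ins. In the actual procedure the cores would be built from statistically equivalent blocks.

```python
import math


def decide(x, cores):
    """Return j if x lies in exactly one core (complement of A_j), else 0 (no decision)."""
    hits = [j for j, in_core in enumerate(cores, start=1) if in_core(x)]
    return hits[0] if len(hits) == 1 else 0


def disc(center, radius):
    """Hypothetical convex core: a closed disc in the plane."""
    return lambda x: math.dist(x, center) <= radius


# Two overlapping cores; their overlap and the area outside both form R_0.
cores = [disc((0.0, 0.0), 2.0), disc((3.0, 0.0), 2.0)]

for point in [(-1.0, 0.0), (4.0, 0.0), (1.5, 0.0), (10.0, 10.0)]:
    print(point, decide(point, cores))
```

The example shows the weakness noted above: when the cores overlap heavily or cover little of the space, a large share of new observations falls in the no-decision region.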

The choice of ordering functions is left to the person who implements
the classifier. Quesenberry and Gessaman give examples of
appropriate ordering functions for (1) two distributions with a monotone
likelihood ratio and (2) two univariate normal distributions. The third
example which was given is repeated below. Suppose two classes are
represented in a two-dimensional space by two distributions, both of
which are thought to be unimodal. A reasonable choice for $\bar{A}_j$ is a
bounded convex region containing $(n_j - a_j + 1)$ blocks. This can be
accomplished in many ways. Figure 2.3 shows an artificial example
which was given to illustrate the tolerance region approach.
The data was generated by drawing samples of size $n_1 = n_2 = 81$ from
bivariate normal distributions with mean vectors

$$(\mu_{11}, \mu_{12}) = (0, 0), \qquad (\mu_{21}, \mu_{22}) = (3, \ldots)$$

and covariance matrices

$$\Sigma_1 = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}, \qquad \Sigma_2 = \begin{bmatrix} 1 & 2 \\ 2 & 9 \end{bmatrix} .$$

An ordering which was suggested by Tukey (1947) was then used to construct
the tolerance regions. Figure 2.3 is the resulting figure for a probability
of 0.90 that the conditional probability of either error is less than 0.14.
The region for deciding class 1 is $R_1$; the region for deciding class 2
is $R_2$, and the region for making no decision is $R_0$.

Figure 2.3. An Example by Quesenberry and Gessaman (1968).


A problem with the use of distribution-free tolerance regions
for estimation of how well the classifier will perform is that the ordering
functions and the blocks to be used in the decision regions should be
chosen without any knowledge of the outcome of the observations. (They
can, of course, be based on any a priori knowledge.) Hence if nothing
is known about the distributions, a classifier which yields very poor
results can be obtained.

In the following chapter, ordering procedures are presented for
the case where nothing is known about the class probability distributions.
Use is made of the fact that the location of the observations of
one class can be used to order the observations of the other classes.
Hence the decision regions can conform to the "shape" of the classes.
These ordering procedures create decision regions suitable for multimodal
class distributions. This is, of course, not the case with the
ordering of Figure 2.3. Furthermore, the procedures of the next
chapter do not yield a reject region, $R_0$. Further comparison of this
approach with the one of Quesenberry and Gessaman is made in Section


Chapter  3 


THE HYPERSPHERE DFTR APPROACH

3.1.  Summary 

The effectiveness of certain methods for constructing distribution-free
tolerance regions for classification purposes is investigated
in this chapter. The approach is first formulated in a two-class problem.
The  proposed  recognition  system  is  one  which  can  be  designed  for  a 
given  expected  false  alarm  probability  (probability  of  misrecognizing 
a  class  1  event  as  a  class  2  event)  or  for  a  given  confidence  that  the 
probability  of  false  alarm  is  less  than  a  given  amount.  It  is  assumed 
that  the  only  information  available  for  designing  the  recognizer  is  a 
properly  labeled  sample  of  statistically  independent  observations  from 
each  class.  A  few  procedures  are  presented  which  have  certain  desirable 
properties  and  which  appear  to  do  a  good  job  of  minimizing  the  miss 
probability  (probability  of  misrecognizing  a  class  2  event  as  a  class  1 
event).  A  procedure  for  obtaining  a  measure  of  the  miss  probability 
is  also  discussed. 

3.2.  Introduction 

Let $\{P_i \mid i \in \Theta\}$, where $\Theta = \{1, \ldots, K\}$ is a finite parameter
space, be a class of probability measures defined over measure space
$(\mathcal{X}, \mathcal{A}, \mu)$. Based on $n_i$ statistically independent observations from
$P_i$, $i = 1, \ldots, K$, a method is sought for classifying an unknown
observation $x$ into one of the $K$ classes described by the $P_i$.

Let us consider the case when $K = 2$. Suppose that the probability
density functions exist and are defined by

$$P_i(X \le x) = F_i(x) = \int_{-\infty}^{x} f_i(x)\, d\mu(x), \qquad i = 1, 2. \tag{3.1}$$


Suppose further that the a priori probability that the observation
$x$ belongs to class $i$ is $\xi_i$; clearly $\sum_{i=1}^{2} \xi_i = 1$. Using the Bayes
Criterion, one decides that $x$ belongs to class 1 if

$$\frac{f_1(x)}{f_2(x)} > \frac{\xi_2 \, [C_2(1) - C_2(2)]}{\xi_1 \, [C_1(2) - C_1(1)]} . \tag{3.2}$$

$C_i(j)$ is the cost of classifying an observation from class $i$ into class $j$.

If the a priori probabilities are unknown, one can use the Neyman-Pearson
criterion and maximize

$$\int_{R_2} dF_2(x) \tag{3.3}$$

subject to the condition that

$$\int_{R_2} dF_1(x) \le \alpha, \qquad 0 < \alpha < 1. \tag{3.4}$$

It is well known that this criterion also yields a likelihood-ratio test.
That is, one decides that $x$ belongs to class 1 if

$$\frac{f_1(x)}{f_2(x)} > L \tag{3.5}$$

where $L$ is such that equation 3.4 is satisfied.
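
For a case where the densities are known, the threshold $L$ can be computed directly. The sketch below assumes, purely for illustration, that class 1 is $N(0, 1)$ and class 2 is $N(\mu_2, 1)$ with $\mu_2 > 0$; the likelihood ratio $f_1/f_2$ is then decreasing in $x$, so $R_2$ is a half-line $x > t$ and the false alarm constraint fixes $t$. The function names are hypothetical.

```python
from statistics import NormalDist


def neyman_pearson_threshold(alpha, mu2):
    """Return (t, L, detection) for f_1 = N(0, 1), f_2 = N(mu2, 1), mu2 > 0."""
    f1 = NormalDist(0.0, 1.0)
    f2 = NormalDist(mu2, 1.0)
    t = f1.inv_cdf(1.0 - alpha)      # R_2 = {x > t}, so P_1(R_2) = alpha
    L = f1.pdf(t) / f2.pdf(t)        # likelihood-ratio threshold at the boundary
    detection = 1.0 - f2.cdf(t)      # P_2(R_2), the quantity being maximized
    return t, L, detection


def decide(x, L, mu2):
    """Decide class 1 when f_1(x) / f_2(x) > L, otherwise class 2."""
    ratio = NormalDist(0.0, 1.0).pdf(x) / NormalDist(mu2, 1.0).pdf(x)
    return 1 if ratio > L else 2


t, L, detection = neyman_pearson_threshold(0.05, 2.0)
print(t, L, detection)
print(decide(0.0, L, 2.0), decide(3.0, L, 2.0))
```

When the densities are unknown this computation is not available, which is the situation the tolerance-region analogue is meant to address.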


In the pattern recognition problem considered here it is assumed
that the probability densities $f_i(x)$, $i = 1, 2$, are unknown. The following
analogue of the Neyman-Pearson criterion evolves naturally. Given
$n_i$ statistically independent observations from $P_i$, $i = 1, 2$, it is desirable
to maximize

$$\int_{R_2} dF_2(x) \tag{3.6}$$

subject to the condition

$$E\left\{ \int_{R_2} dF_1(x) \right\} \le \alpha, \qquad 0 < \alpha < 1 \tag{3.7a}$$

or subject to the condition

$$\Pr\left\{ \int_{R_2} dF_1(x) \le \beta \right\} \ge \gamma, \qquad 0 < \beta < 1, \quad 0 < \gamma < 1. \tag{3.7b}$$


Conditions (3.7a) or (3.7b) can be established even though $F_1(x)$
is unknown. This is done through the theory of distribution-free tolerance
regions. A tolerance region with the property $E\{\int_{R_2} dF_1(x)\} = \alpha$ is
known as an $\alpha$-expected tolerance region. A tolerance region with the
property $\Pr\{\int_{R_2} dF_1(x) \ge \beta\} = \gamma$ is known as a $\beta$-content tolerance
region at level $\gamma$. It should be noted that $E\{\int_{R_2} dF_1(x)\} = \alpha$ can be
considered an $\alpha$-confidence statement that a new observation from $F_1(x)$
will fall in $R_2$. This fact is demonstrated in Appendix C.


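3.3. Application of Distribution-Free Tolerance Regions to Classification

As previously stated we would like to maximize

$$\int_{R_2} dF_2(x)$$   (3.6)

under one of the following constraints:

$$E\Bigl\{\int_{R_2} dF_1(x)\Bigr\} \le \alpha, \qquad 0 < \alpha < 1$$   (3.7a)

or

$$\Pr\Bigl\{\int_{R_2} dF_1(x) \le \beta\Bigr\} \ge \gamma, \qquad 0 < \beta < 1, \quad 0 < \gamma < 1.$$   (3.7b)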

The problem is to order the observations from $F_1(x)$ so that $R_2$ consists of the number of blocks "m" required to satisfy (3.7a) or (3.7b) and so that $\int_{R_2} dF_2(x)$ is maximized. The number of blocks needed to satisfy equation 3.7b can be found by consulting tables of the Beta distribution, tables of the cumulative binomial distribution, tables by Somerville (1958), or graphs by Murphy (1948). The number of blocks needed to satisfy equation 3.7a can be obtained from the equation $E\{\int_{R_2} dF_1(x)\} = m/(n_1+1)$. Therefore, if equation 3.7a is to be satisfied, m is the largest integer less than or equal to $(n_1+1)\alpha$.
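As an illustration of the rule just stated, the following Python sketch computes m for constraint 3.7a. It also evaluates the mean and standard deviation of the false alarm probability under the assumption, standard for statistically equivalent blocks but not derived in this section, that the coverage of m of the $n_1+1$ blocks follows a Beta distribution with parameters m and $n_1+1-m$. The function names are hypothetical.

```python
import math

def blocks_for_expected_level(n1, alpha):
    """Largest integer m with m / (n1 + 1) <= alpha (constraint 3.7a)."""
    # The small epsilon guards against floating-point round-off in the product.
    return math.floor((n1 + 1) * alpha + 1e-12)

def coverage_mean_std(n1, m):
    """Mean and standard deviation of the false alarm probability.

    Assumes the coverage of m out of n1 + 1 blocks is Beta(m, n1 + 1 - m).
    """
    mean = m / (n1 + 1)
    var = m * (n1 + 1 - m) / ((n1 + 1) ** 2 * (n1 + 2))
    return mean, math.sqrt(var)

if __name__ == "__main__":
    print(blocks_for_expected_level(210, 0.05))
    print(coverage_mean_std(210, 6))
```

Under the Beta assumption, $n_1 = 210$ and m = 6 give a mean near 0.0284 and a standard deviation near 0.0114, which agrees with the values quoted in this section for that case.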

The blocks should be constructed so that $\int_{R_2} dF_2(x)$ is maximized. It is assumed that the only information given about $F_1(x)$ or $F_2(x)$ is that they are continuous cumulative distribution functions. Therefore, given only a finite number of observations from each class, one can never be certain that $\int_{R_2} dF_2(x)$ is maximized. A likely approach is to construct $R_2$ so that it contains as many $P_2$ observations as possible. All of the $P_2$ observations can be contained in $R_2$ if the $P_1$ observations, $X_1^{(1)}, \ldots, X_{n_1}^{(1)}$, are ordered by functions which are in some sense centered about all $n_2$ of the $P_2$ observations, $X_1^{(2)}, \ldots, X_{n_2}^{(2)}$. Then $R_2$ is made up of the first m blocks established by the ordering.¹

To accomplish this ordering consider the continuous functions $d_{kj}(x, X_k^{(2)})$ of the arbitrary observation vector x, where $k = 1, \ldots, n_2$ and $j = 1, \ldots, n_1$. These functions are basically "distance" functions that satisfy the following conditions:

1. $d_{kj}(x, X_k^{(2)}) \ge 0$

2. $d_{kj}(X_k^{(2)}, X_k^{(2)}) = 0$   (3.8)

A simple example is

$$d_{kj}(x, X_k^{(2)}) = |x - X_k^{(2)}|$$   (3.9)


(Note that the subscript k is used to label possibly differing functions which can be associated with each of the observations from the class $P_2$. The subscript j is used to label possibly differing functions used to form successive blocks. The need for such functions is illustrated in the next section.)

The $n_1+1$ blocks $B_1, B_2, \ldots, B_{n_1+1}$ can now be defined as follows: First, let $d_1(X_{(1)}^{(1)})$ be the smallest distance between points from the two different classes; i.e.

$$d_1(X_{(1)}^{(1)}) = \min_{1 \le k \le n_2}\ \min_{1 \le i \le n_1} d_{k1}(X_i^{(1)}, X_k^{(2)})$$   (3.10)

Define the regions $L_{k1}$, $k = 1, \ldots, n_2$, by

$$L_{k1} = \{x : d_{k1}(x, X_k^{(2)}) \le d_1(X_{(1)}^{(1)})\}$$   (3.11)

For two-dimensional vectors x and a metric as given by equation 3.9, the regions $L_{k1}$ are seen to be circles centered at the points $X_k^{(2)}$ with radius $d_1(X_{(1)}^{(1)})$ (see Fig. 3.1). The definition of $d_1(X_{(1)}^{(1)})$ is such that at least one of these circles contains a point from class 1 on its circumference (the point $X_{(1)}^{(1)}$ in the figure). The probability that there is more than one such point is assumed negligible. This point is labeled $X_{(1)}^{(1)}$ and is said to be ordered. It is clear that for n-dimensional vectors and for the metric of equation 3.9, the regions $L_{k1}$ are hyperspheres.

1. The idea of using ordering functions which are centered by the observations was suggested by Professor I. R. Savage.


The first block $B_1$ is now defined as the union of all the regions $L_{k1}$:

$$B_1 = \bigcup_{k=1}^{n_2} L_{k1}$$   (3.12)


To obtain the second block the distance $d_2(X_{(2)}^{(1)})$ is defined by

$$d_2(X_{(2)}^{(1)}) = \min_{1 \le k \le n_2}\ \min_{\substack{1 \le i \le n_1 \\ i \ne (1)}} d_{k2}(X_i^{(1)}, X_k^{(2)})$$   (3.13)

The implication of the subscript "2" of $d_2(\cdot)$ is that $d_1(\cdot)$ and $d_2(\cdot)$ can be completely different functions. The regions $L_{k2}$ are then defined as before by

$$L_{k2} = \{x : d_{k2}(x, X_k^{(2)}) \le d_2(X_{(2)}^{(1)})\}, \qquad k = 1, \ldots, n_2$$   (3.14)

[Figure 3.1. A Hypersphere Ordering.]


If $d_{k1}(\cdot)$ and $d_{k2}(\cdot)$ are both of the form of equation 3.9 then, for two-dimensional vectors, the regions $L_{k2}$ are circles extending to the next closest point of class 1, which is labeled $X_{(2)}^{(1)}$.

The second block is now given by

$$B_2 = \Bigl(\bigcup_{k=1}^{n_2} L_{k2}\Bigr) \cap \bar{B}_1$$   (3.15)

where $\bar{B}_1$ is the set of points not contained in $B_1$. In our example $B_2$ would consist of the union of all the annular areas between the circles of radius $d_1(X_{(1)}^{(1)})$ and $d_2(X_{(2)}^{(1)})$.

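This procedure is continued, and therefore the r-th block is given by

$$B_r = \Bigl(\bigcup_{k=1}^{n_2} L_{kr}\Bigr) \cap \Bigl(\bigcap_{i=1}^{r-1} \bar{B}_i\Bigr)$$   (3.16)

where

$$L_{kr} = \{x : d_{kr}(x, X_k^{(2)}) \le d_r(X_{(r)}^{(1)})\}, \qquad k = 1, \ldots, n_2$$

and where

$$d_r(X_{(r)}^{(1)}) = \min_{1 \le k \le n_2}\ \min_{\substack{1 \le i \le n_1 \\ i \ne (1), \ldots, (r-1)}} d_{kr}(X_i^{(1)}, X_k^{(2)}).$$

The $(n_1+1)$-th block is

$$B_{n_1+1} = X - \bigcup_{i=1}^{n_1} B_i = X \cap \Bigl(\bigcap_{i=1}^{n_1} \bar{B}_i\Bigr)$$   (3.17)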


It is convenient to think of the blocks as being generated by hyperspheres (or other hypervolumes, depending on the form of $d(x, X^{(2)})$) expanding from the centers $X_k^{(2)}$, $k = 1, \ldots, n_2$. The expansion continues until the first observation of class 1 is reached; this observation is thereby ordered (i.e. given rank order (1)) and the resulting volume is the first block. Further blocks are generated by further expansions to the remaining $(n_1 - 1)$ $P_1$ observations.

The region $R_2$ is the union of the first m blocks formed by this ordering,

$$R_2 = \bigcup_{i=1}^{m} B_i.$$


The value for m is obtained from the constraints on the design of the classifier. For example, suppose a classifier is to be designed in which one has 95% confidence that the false alarm probability will be less than 0.05. Then $\beta$ and $\gamma$ in equation 3.7b are equal to 0.05 and 0.95, respectively. One of the variables, $n_1$, the number of $P_1$ training observations, or m, the number of blocks used to construct $R_2$, is now fixed. The value of the other variable can be found from graphs by Murphy (1948) or from tables by Somerville (1958). For example, we find from Murphy for $\beta = 0.05$, $\gamma = 0.95$, and $n_1 = 210$ observations from $P_1$, that 6 blocks may be used to construct $R_2$. These numbers give an expected value for the false alarm probability of 0.0284 with a standard deviation of 0.0114. Table 3.1 shows the mean and standard deviation for 3 values of m and $n_1$ which satisfy the condition $\beta = 0.05$ and $\gamma = 0.95$. As seen from the table, when many observations are available from $P_1$, the expected false alarm probability can be higher for the same $\beta$ and $\gamma$ than when few observations are available from $P_1$.

When a new observation V is to be classified, the following rule is used. If V falls in $R_2$, V is classified as a member of class 2. Otherwise V is classified as a member of class 1.
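For the simplest case, in which every ordering function is the distance of equation 3.9, the rule can be sketched in Python as follows. Under that assumption the union of the first m blocks is the set of points lying within $d_m$ of some $P_2$ observation, where $d_m$ is the m-th smallest of the nearest-$P_2$ distances of the $P_1$ observations. The function names are hypothetical, and ties between distances are assumed not to occur.

```python
import math

def block_radius(p1_points, p2_points, m):
    """Radius d_m reached when the m-th P1 observation is ordered.

    Assumes all hyperspheres expand at the same rate (equation 3.9).
    """
    if not 1 <= m <= len(p1_points):
        raise ValueError("m must lie between 1 and the number of P1 observations")
    # Distance from each P1 observation to its nearest P2 observation.
    nearest = sorted(min(math.dist(o, x) for x in p2_points) for o in p1_points)
    return nearest[m - 1]

def classify(v, p2_points, radius):
    """Return 2 if v falls in R_2 (within radius of a P2 observation), else 1."""
    return 2 if min(math.dist(v, x) for x in p2_points) <= radius else 1

if __name__ == "__main__":
    p2 = [(0.0, 0.0), (10.0, 0.0)]
    p1 = [(1.0, 0.0), (3.0, 0.0), (10.0, 4.0), (20.0, 20.0)]
    r = block_radius(p1, p2, m=2)
    print(r, classify((0.0, 2.5), p2, r), classify((5.0, 0.0), p2, r))
```

Only the $P_2$ observations and the single radius need to be kept in order to classify new observations in this case.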

Table 3.1. Expected Value and Standard Deviation for $\beta = 0.05$ and $\gamma = 0.95$ and for Different m and $n_1$.

$\Pr\{\int_{R_2} dF_1(x) \le .05\} = .95$     $E\{\int_{R_2} dF_1(x)\}$     $\sigma\{\int_{R_2} dF_1(x)\}$

$n_1 = 430$, m = 15                            0.0348                        0.0088
$n_1 = 210$, m = 6                             0.0284                        0.0114
$n_1 = 58$,  m = 1                             0.0170                        0.0167
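In place of tables or graphs, the confidence attached to a given m can be computed directly. The sketch below assumes, as is standard for statistically equivalent blocks, that the coverage of the first m blocks is Beta distributed with parameters m and $n_1+1-m$, and uses the identity between the Beta distribution function and the upper tail of a binomial distribution. The function names are hypothetical, and values read from graphs need not agree exactly with this computation.

```python
import math

def content_confidence(n1, m, beta):
    """Pr{false alarm probability <= beta} when R_2 is made of m blocks.

    Uses Pr{Beta(m, n1 + 1 - m) <= beta} = Pr{Binomial(n1, beta) >= m}.
    """
    below = sum(
        math.comb(n1, k) * beta**k * (1.0 - beta) ** (n1 - k) for k in range(m)
    )
    return 1.0 - below

def max_blocks(n1, beta, gamma):
    """Largest m whose confidence is at least gamma (0 if none qualifies)."""
    m = 0
    while m < n1 and content_confidence(n1, m + 1, beta) >= gamma:
        m += 1
    return m

if __name__ == "__main__":
    print(content_confidence(59, 1, 0.05))
    print(max_blocks(59, 0.05, 0.95))
```

Because the confidence decreases as m grows, the search in `max_blocks` can stop at the first m that fails the constraint.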

3.4.  Discussion  of  Practical  Ordering  Functions 

The  purpose  of  this  section  is  to  discuss  some  simple  ordering 
procedures based on the idea of expanding functions from the $P_2$ observations. The relative merits of these procedures when applied to a problem with a limited sample size are investigated. For simplicity, let the ordering functions be defined as follows:

$$d_{kj}(x, X_k^{(2)}) = a_{kj}\,|x - X_k^{(2)}|, \qquad a_{kj} \ge 1, \quad k = 1, \ldots, n_2, \quad j = 1, \ldots, n_1$$   (3.18)

Then the $P_1$ observations are ordered by hyperspheres which expand from the $P_2$ observations.

Distribution-free  tolerance  regions  can  be  formed  by  any  of  the 
following  three  ordering  procedures: 


(1) All Hyperspheres Expand (AHE)

Hyperspheres expand at the same rate from all $P_2$ observations until $n_1+1$ blocks have been formed. The first m of these blocks make up region $R_2$. Since the hyperspheres expand at the same rate, let $a_{kj} = 1$, $k = 1, \ldots, n_2$, $j = 1, \ldots, n_1$. Then the ordering functions are given by

$$d_{kj}(x, X_k^{(2)}) = |x - X_k^{(2)}|, \qquad k = 1, \ldots, n_2, \quad j = 1, \ldots, n_1$$   (3.19)

The statistically equivalent blocks are described by equations 3.10 through 3.17. This procedure is illustrated in Figure 3.2 for a two-dimensional vector x, m = 3 blocks, $n_2 = 6$, $n_1 = 33$, with the observations $X_1^{(1)}, \ldots, X_{n_1}^{(1)}$ represented by O's and the observations $X_1^{(2)}, \ldots, X_{n_2}^{(2)}$ represented by X's.


At times the number of blocks with which region $R_2$ is formed may be small with respect to the number of blocks that are needed for a reasonably low miss rate when using the above ordering procedure. This situation can be a direct result of having few $P_2$ training observations with at least m $P_1$ training observations being relatively close to the $P_2$ observations. For example, notice the two-dimensional example of Figure 3.16. The circular regions surrounding the $P_2$ observations are unconnected. In this case a procedure which assumes that more $P_1$ observations may be found in the vicinity of the $P_1$ observations which have previously been ordered allows the regions centered by the $P_2$ observations to expand into regions which are connected and which have a larger volume than the regions produced by the AHE ordering procedure. The following two procedures use the information from previously ordered $P_1$ observations to allow region $R_2$ to expand faster in directions away from the clustered $P_1$ observations than toward them.

(2) Ordered Hyperspheres Slowed

With this procedure the hyperspheres which order the $P_1$ observations are not allowed to expand as rapidly in subsequent orderings as the other hyperspheres. The functions for ordering the first $P_1$ observation are the same as those for procedure (1). That is,

$$d_{k1}(x, X_k^{(2)}) = |x - X_k^{(2)}|, \qquad k = 1, \ldots, n_2$$   (3.20)

The functions for ordering the second $P_1$ observation are given by

$$d_{k2}(x, X_k^{(2)}) = a_{k2}\,|x - X_k^{(2)}|, \qquad k = 1, \ldots, n_2$$   (3.21)

where

$a_{k2} > a_{k1} = 1$ for k = (1)
$a_{k2} = a_{k1} = 1$ otherwise.

Note that k = (1) is any k which satisfies

$$d_1(X_{(1)}^{(1)}) = a_{k1}\,|X_{(1)}^{(1)} - X_k^{(2)}|.$$   (3.22)

The increase of $a_{k1}$ to $a_{k2}$ may be viewed as a decrease in the rate at which the ordering hypersphere expands from the observation $X_k^{(2)}$, k = (1), in search of the next $P_1$ observation.

The functions for ordering the r-th $P_1$ observation are given by

$$d_{kr}(x, X_k^{(2)}) = a_{kr}\,|x - X_k^{(2)}|$$   (3.23)

where

$a_{kr} > a_{k,r-1}$ for $k = (1), \ldots, (r-1)$
$a_{kr} = a_{k,r-1}$ otherwise.

k = (j) is any k which satisfies

$$d_j(X_{(j)}^{(1)}) = a_{kj}\,|X_{(j)}^{(1)} - X_k^{(2)}|.$$   (3.24)


The adjustment of the multiplicative constant $a_{kr}$, $k = (1), \ldots, (r-1)$, is quite arbitrary. In the speaker recognition experiment to be discussed in Chapter 4 the increase of $a_{k,r-1}$ to $a_{kr}$ was made very large so that the differences in the three ordering procedures would become evident. Suppose $a_{kr}$ is determined by

$a_{kr} = (N a_{k,r-1})^N$ for $k = (1), \ldots, (r-1)$
$a_{kr} = a_{k,r-1}$ otherwise.   (3.25)


Suppose N is a large number. This causes the hyperspheres which order the $P_1$ observations essentially to stop expanding in relation to the hyperspheres which have not ordered a $P_1$ observation. The ordering procedure for this case will be called Ordered Hyperspheres Constant, OHC.

Suppose all the hyperspheres have ordered a $P_1$ observation. Then $m \ge n_2$. In ordering the $(n_2+1)$-th $P_1$ observation the above procedure causes all hyperspheres to expand at the same rate. Whenever the $(n_2+1)$-th $P_1$ observation is located, the hypersphere which orders this observation stops expanding in relation to the other hyperspheres. This is because for $m = n_2$ all $a_{km}$ are now large, and therefore the constant of the hypersphere which orders the $(n_2+1)$-th observation is again increased according to equation 3.25 and is larger than the others. Figure 3.3 illustrates the OHC procedure for the same sample set as used in Figure 3.2, where the X's and O's again refer to the $P_2$ and $P_1$ observations, respectively.

The  first  block  for  the  OHC  procedure  is  the  same  as  the  first 
block  for  the  AHE  procedure.  However,  in  this  example  the  second  block 
for the OHC procedure differs from the second block for the AHE procedure.
This  is  because  the  hyperspheres  (circles  in  the  figure)  expand  from  all 
X's  except  Xj  in  search  for  a  new  P^  observation.  The  observation, 
which  is  found  is  and  it  is  intersected  by  the  circle  expanding  from 
X^  .  Then,  in  forming  the  third  block,  circles  expand  from  all  X's 
except  Xj  and  X^  .  Observation  O^  is  found  and  block  is  formed. 

(3) Conditioned Hyperspheres Slowed

With this procedure the growth of hyperspheres which intersect the $P_1$ observations is slowed even if these observations have been previously ordered by other hyperspheres.

The ordering functions for j = 1 are equivalent to the ordering functions for j = 1 for the previous two procedures. The remaining ordering functions can be different from those of the previous two procedures because here an ordering hypersphere is slowed if it comes into contact with $X_{[1]}^{(1)}$, the $P_1$ observation for which

$$d_2(X_{[1]}^{(1)}) = \min_{1 \le k \le n_2}\ \min_{1 \le i \le n_1} a_{k2}\,|X_i^{(1)} - X_k^{(2)}|$$   (3.26)

where

$a_{k2} > a_{k1} = 1$ for k = (1)
$a_{k2} = a_{k1} = 1$ otherwise

and k = (1) is any k which satisfies equation 3.22. Equation 3.26 can be viewed as telling us that during the second ordering the expanding hyperspheres have intersected a $P_1$ observation, $X_{[1]}^{(1)}$. But this observation might well be the $P_1$ observation $X_{(1)}^{(1)}$ which has already been ordered. For example, in Figure 3.4, the observation which satisfies equation 3.26 is $O_{(1)}$. But this observation has already been ordered. Hence another $P_1$ observation has to be found before the second block is completed.

Therefore, if $X_{[1]}^{(1)} \ne X_{(1)}^{(1)}$, denote $X_{[1]}^{(1)}$ by $X_{(2)}^{(1)}$; $B_2$ is given by equations 3.14 and 3.15. If $X_{[1]}^{(1)} = X_{(1)}^{(1)}$, a block has not been completed. Let

$$G_{k2} = \{x : d_{k2}(x, X_k^{(2)}) \le d_2(X_{[1]}^{(1)})\}.$$   (3.27)

Now let

$$d_{k3}(x, X_k^{(2)}) = a_{k3}\,|x - X_k^{(2)}|$$   (3.28)

where

$a_{k3} > a_{k2}$ for k = (1), [1]
$a_{k3} = a_{k2}$ otherwise.   (3.29)

Note k = [1] is any k for which

$$d_2(X_{[1]}^{(1)}) = a_{k2}\,|X_{[1]}^{(1)} - X_k^{(2)}|.$$   (3.30)

Now let

$$d_3(X_{[2]}^{(1)}) = \min_{1 \le k \le n_2}\ \min_{1 \le i \le n_1} d_{k3}(X_i^{(1)}, X_k^{(2)})$$   (3.31)

and

$$G_{k3} = \{x : d_{k3}(x, X_k^{(2)}) \le d_3(X_{[2]}^{(1)})\}.$$   (3.32)

This procedure is continued until $X_{[j]}^{(1)} \ne X_{(1)}^{(1)}$ or $(j+1) > n_2$. If $X_{[j]}^{(1)} \ne X_{(1)}^{(1)}$ and $(j+1) \le n_2$, let $X_{[j]}^{(1)} = X_{(2)}^{(1)}$. The second block is

$$B_2 = \Bigl(\bigcup_{r=1}^{j}\ \bigcup_{k=1}^{n_2} G_{k,r+1}\Bigr) \cap \bar{B}_1.$$   (3.33)

If $j+1 > n_2$ and $X_{[i]}^{(1)} = X_{(1)}^{(1)}$, $i = 1, \ldots, j$, let

$$d_{n_2}(X_{(2)}^{(1)}) = \min_{1 \le k \le n_2}\ \min_{\substack{1 \le i \le n_1 \\ i \ne (1)}} d_{k n_2}(X_i^{(1)}, X_k^{(2)})$$   (3.34)

and

$$G_{k n_2} = \{x : d_{k n_2}(x, X_k^{(2)}) \le d_{n_2}(X_{(2)}^{(1)})\}.$$

Then

$$B_2 = \Bigl(\bigcup_{r=1}^{n_2}\ \bigcup_{k=1}^{n_2} G_{k,r+1}\Bigr) \cap \bar{B}_1.$$   (3.35)

Note that the restriction $(j+1) > n_2$ is necessary to eliminate the possibility that the procedure enters an infinite loop. The condition $(j+1) > n_2$ was chosen especially for the case where the hyperspheres stop when they intersect a $P_1$ observation (equation 3.25, where N is a very large number). In this case all hyperspheres are allowed to expand until they intersect the $P_1$ observation which normally would cause the procedure to enter an infinite loop. Then they are allowed to expand past this observation. For example, consider Figure 3.4. The observation $O_{(1)}$ will stop every expanding hypersphere. The condition $(j+1) > n_2$ allows all four hyperspheres to expand to $O_{(1)}$ and then to expand past $O_{(1)}$ to form the second block.

The extension of this procedure for the formation of m blocks is straightforward. Figure 3.5 illustrates the procedure for the same sample set as used in Figures 3.2 and 3.3 and for N equal to a very large number and

$a_{kr} = (N a_{k,r-1})^N$ for $k = (1), [1], [2], \ldots, [j]$
$a_{kr} = a_{k,r-1}$ otherwise.   (3.36)


This  ordering  is  called  CHS,  Conditioned  Hyperspheres  Stop.  In  this 
example  the  second  block  for  the  CHS  procedure  differs  from  the  second 
block for the OHC procedure. In forming the second block the hyperspheres
are  expanding  from  all  X's  except  Xj  .  When  the  hypersphere  expanding 
from  X^  intersects  Oj,  it  stops  in  the  CHS  procedure,  even  though  a 
block  has  not  been  completed.  Hyperspheres  continue  to  expand  from 
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observations  X^,  X^,  Xg,  and  in  aearch  of  a  new  0  observation. 

The hypersphere which expands from Xg intersects and the second
block  is  complete.  Then  hyperspheres  expand  from  X^,  Xg,  and 
in  search  of  a  new  observation.  The  procedure  is  continued  in  this 
manner. 

The resulting region $R_2$ (for m = 3 blocks) is shown in Figure 3.6 for each of the three procedures. In this particular example it is seen that the AHE procedure produces a region $R_2$ which has expanded into the O's whereas the CHS procedure produces a region $R_2$ which has been stopped by the O's and has expanded in a direction away from the ordered O's.

It is evident that these three procedures are not the only procedures that can be formulated when hyperspheres expand from the $P_2$ observations. For example, one might decide to slow the expansion of any ordering hypersphere which is in the vicinity of an ordered $P_1$ observation. However, this procedure would bias the estimate of the miss probability, which is discussed in section 3.9.

A comparison of the three ordering procedures requires iteration over all possible sample sets. Nevertheless, some general observations can be made.

1) The AHE procedure is probably preferable to the OHC procedure, which is probably preferable to the CHS procedure, if spurious $P_1$ observations are involved. This is because hyperspheres in the AHE procedure, and in the OHC procedure to a lesser extent, continue to expand past the peripheral $P_1$ observations. For example, consider Figure 3.7 where two blocks make up $R_2$. The resulting $R_2$ for the AHE and the CHS procedures is shown in the figure. Notice that the hyperspheres which expand from $X_1$ and $X_2$ stop when they intersect $O_{(1)}$ in the CHS procedure. They, of course, do not stop in the AHE procedure. The area of $R_2$ for the AHE procedure is equal to the area of $R_2$ for the CHS procedure plus the crosshatched area. Hence the miss probability in this case is less for the AHE procedure than for the CHS procedure.

2) The CHS procedure is probably preferable to the OHC procedure, which is probably preferable to the AHE procedure, if the $P_1$ observations are tightly clustered in two or more clusters and the clusters are different distances from the $P_2$ observations. This is because the hyperspheres in the CHS procedure, and in the OHC procedure to a lesser extent, expand more in directions away from the ordered $P_1$ observations than do the hyperspheres of the AHE procedure. For example, consider Figure 3.8 where two blocks are formed with the AHE and the CHS ordering procedures. The area of $R_2$ for the CHS procedure is equal to the area of $R_2$ for the AHE procedure plus the crosshatched region minus the shaded area. Since the crosshatched area is much larger than the shaded area, one may feel that the miss probability in this case is less for the CHS procedure than for the AHE procedure.

[Figure 3.7. A Comparison in Favor of the AHE Procedure.]

[Figure 3.8. A Comparison in Favor of the CHS Procedure.]

3.5. Programming on a Digital Computer

Note that these three procedures are very easily programmed on a digital computer. One simply calculates the distance between every $P_1$ and $P_2$ training observation. Let the distance between $X_i^{(1)}$ and $X_j^{(2)}$ be denoted by $D_{ij}$. These distances are arranged in a matrix D, where the i-th row consists of the distances between $X_i^{(1)}$ and $X_j^{(2)}$, $j = 1, \ldots, n_2$.

Consider first the AHE procedure. A search is made through the elements of the matrix for the smallest distance. Let this distance be $D_{kr}$. Then the k-th $P_1$ observation is ordered and a block is formed. The k-th row is multiplied by the largest number available on the machine. This removes the k-th $P_1$ observation from further ordering. A search is now made through the elements of the new matrix for the smallest distance. This procedure is continued until m blocks are formed.
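The AHE search just described can be sketched in Python as follows. This is an illustration under stated assumptions rather than the program used in the study: the function names are hypothetical, and the step of multiplying a row by the largest machine number is replaced by setting the row to infinity, which has the same effect of removing it from later searches.

```python
import math

BIG = float("inf")  # stands in for "the largest number available on the machine"

def distance_matrix(p1_points, p2_points):
    """D[i][j] is the distance between the i-th P1 and the j-th P2 observation."""
    return [[math.dist(o, x) for x in p2_points] for o in p1_points]

def ahe_order(D, m):
    """Return (k, r, distance) for each of the first m blocks."""
    if not 1 <= m <= len(D):
        raise ValueError("m must lie between 1 and the number of P1 observations")
    D = [row[:] for row in D]  # work on a copy of the matrix
    blocks = []
    for _ in range(m):
        # Search the whole matrix for the smallest remaining distance D[k][r].
        d, k, r = min(
            (D[i][j], i, j) for i in range(len(D)) for j in range(len(D[0]))
        )
        blocks.append((k, r, d))
        D[k] = [BIG] * len(D[k])  # remove the k-th P1 observation from further ordering
    return blocks

if __name__ == "__main__":
    p2 = [(0.0, 0.0), (10.0, 0.0)]
    p1 = [(1.0, 0.0), (3.0, 0.0), (10.0, 4.0), (20.0, 20.0)]
    print(ahe_order(distance_matrix(p1, p2), 3))
```

The distance recorded for the m-th block is the radius needed later to decide whether a new observation falls in the accepted region.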

In the ordered hyperspheres slowed procedure a search is made through the elements of the matrix D for the smallest distance as before. Let this distance be $D_{kr}$. The k-th row is then multiplied by the largest number available on the machine. Thus far, the two procedures are the same. Now the r-th column is multiplied by a number which controls the rate at which the hypersphere expands from the r-th $P_2$ observation. A search is made through the elements of the new matrix for the smallest distance. This procedure is continued with the columns and the rows corresponding to the smallest distance being multiplied by the appropriate numbers after each block is formed.
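A minimal Python sketch of this row-and-column bookkeeping is given below. The function name and the parameter `slow` (the column multiplier) are hypothetical; a very large `slow` corresponds to hyperspheres that essentially stop once they have ordered an observation. As an assumption of the sketch, the row is set to infinity rather than multiplied by the largest machine number.

```python
BIG = float("inf")  # stands in for "the largest number available on the machine"

def slowed_order(D, m, slow):
    """Return (k, r, scaled distance) for each of the first m blocks.

    D[i][j] is the distance between the i-th P1 and the j-th P2 observation.
    """
    if not 1 <= m <= len(D):
        raise ValueError("m must lie between 1 and the number of P1 observations")
    D = [row[:] for row in D]  # work on a copy of the matrix
    blocks = []
    for _ in range(m):
        # Smallest entry of the current (rescaled) matrix.
        d, k, r = min(
            (D[i][j], i, j) for i in range(len(D)) for j in range(len(D[0]))
        )
        blocks.append((k, r, d))
        D[k] = [BIG] * len(D[k])  # the k-th P1 observation is now ordered
        for i in range(len(D)):
            D[i][r] *= slow  # slow the hypersphere centered on the r-th P2 observation
    return blocks

if __name__ == "__main__":
    D = [[1.0, 9.0], [3.0, 7.0], [10.77, 4.0], [28.28, 22.36]]
    print(slowed_order(D, 3, slow=100.0))
```

With the small matrix in the example, the $P_1$ observations are ordered as rows 0, 2, 1, whereas equal expansion rates would order them as rows 0, 1, 2.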
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Now  consider  the  conditioned  hypersphere  slowed  procedure. 

As in the previous two procedures a search is made for the smallest distance. Let this distance be $D_{kr}$. The r-th column is now multiplied by a number which controls the rate at which the hypersphere expands from the r-th $P_2$ observation. Unlike the previous two procedures, the k-th row remains untouched. A search is made through the elements of the new matrix for the smallest distance. Let this distance be $D_{st}$. If s = k, a block has not been formed. In any case the t-th column is multiplied by a number which controls the rate at which the hypersphere expands from the t-th $P_2$ observation. This procedure is continued until a smallest distance $D_{gh}$ is found so that $g \ne k$ or until a restriction on the number of iterations is reached ($(j+1) > n_2$ in the above discussion).

If a smallest distance $D_{gh}$ is found so that $g \ne k$, the second block is formed. The h-th column is multiplied by a number which controls the rate at which the hypersphere expands from the h-th $P_2$ observation and the procedure is continued. If the restriction on the number of iterations is reached, the k-th row is multiplied by the largest number available on the machine. A search is then made through the elements of the new matrix for the smallest distance. When this distance is found, the second block is formed. The procedure is continued in this fashion until m blocks are formed.

Note that when a new observation is to be classified, the following information must be stored in the computer for the various procedures.

1) AHE: all $P_2$ observations and the value of the m-th order statistic.

2) OHC: all $P_2$ observations and the value of the m order statistics along with the indices of the $P_2$ observations corresponding to the m order statistics.

3) CHS: all $P_2$ observations and the values of the distances along with the corresponding indices of the $P_2$ observations found in the ordering process. This procedure requires at least as much storage as the OHC procedure.

It is obvious that the information to be stored can be reduced further by clustering the $P_2$ observations and using representative points for the clusters (for example the means of the clusters) to order the $P_1$ observations. However, this approach biases the estimate of the miss probability as seen in section 3.9.

On the other hand, in a situation where too few $P_2$ observations are available (see Fig. 3.16), one can sometimes cause the regions of $R_2$ to be connected by adding "fictitious" $P_2$ observations between the $P_2$ observations whose nearest $P_2$ observation is furthest away. Hyperspheres expand from these "fictitious" $P_2$ observations in the same manner as they did from the "real" $P_2$ observations.

3.6. Large Sample Properties

Three  methods  were  proposed  in  section  3.4  for  the  classification 
of  observations  from  two  different  classes.  The  large  sample  properties 
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of  these  methods  are  investigated  in  this  section.  The  goal  for  the 
nonparametric  method  is  the  emulation  of  the  Neyman-Pearson  rule, 
which  was  stated  in  equations  3.  3  and  3.4. 

Let $n_1$, the number of observations from $P_1$, and $n_2$, the number of observations from $P_2$, approach infinity such that $n_1/n_2$ is bounded away from zero and infinity. Let the classifier be designed for

$$E\Bigl\{\int_{R_2} dF_1(x)\Bigr\} = \frac{m}{n_1+1} = \alpha$$

where $m < n_1+1$. Hence m approaches infinity while $m/(n_1+1) = \alpha$.

The false alarm probability converges in probability to the desired value $\alpha$ as $n_1$ approaches infinity, i.e.

$$\int_{R_2} dF_1(x) \longrightarrow \alpha.$$

This follows directly from the Tchebycheff inequality since the variance of $\int_{R_2} dF_1(x)$, considered as a random variable, approaches zero as $n_1$ approaches infinity.
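The variance statement can be made explicit under the assumption, standard for statistically equivalent blocks and consistent with the mean $m/(n_1+1)$ used above, that the coverage of $R_2$ follows a Beta distribution with parameters m and $n_1+1-m$. With $m/(n_1+1) = \alpha$ and any fixed $\epsilon > 0$:

```latex
\operatorname{Var}\Bigl\{\int_{R_2} dF_1(x)\Bigr\}
  = \frac{m\,(n_1+1-m)}{(n_1+1)^2\,(n_1+2)}
  = \frac{\alpha\,(1-\alpha)}{n_1+2}
  \;\longrightarrow\; 0 \quad (n_1 \to \infty),
\qquad
\Pr\Bigl\{\Bigl|\int_{R_2} dF_1(x) - \alpha\Bigr| \ge \epsilon\Bigr\}
  \le \frac{\alpha\,(1-\alpha)}{(n_1+2)\,\epsilon^{2}} .
```

The second relation is the Tchebycheff inequality applied to this variance; its right-hand side vanishes as $n_1$ grows, which is the stated convergence in probability.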

We now wish to determine the outcome of $\int_{R_2} dF_2(x)$ as $n_1$ and $n_2$ approach infinity. For simplicity, suppose that the probability densities $f_1(x)$ and $f_2(x)$ are continuous, univariate, unimodal, and nonzero everywhere. Furthermore, let $n_1 = n_2 = n$. Consider the DFTR method called All Hyperspheres Expand (AHE).

Figure 3.9 gives a general picture of the situation to be discussed. As in the previous examples, the $P_2$ observations are represented as X's and the $P_1$ observations as O's.

[Figure 3.9. A One-Dimensional Example.]

[Figure 3.10. Length of the Accept Region on the Positive Line. Vertical axis: length. Horizontal axis: number of observations from each class (24, 50, 100, 250, 500, 1000).]

As n becomes large, the ratio of $P_1$ observations to $P_2$ observations in any small region R approaches the average likelihood ratio existing in this region, $f_1(x)/f_2(x)$. Since $f_1(x)$ and $f_2(x)$ were assumed to be continuous, it can be assumed that as R becomes very small $f_1(x)/f_2(x)$ is approximately constant in R.

Consider a small interval [a,b] under the peak of $f_1(x)$. Suppose that $\alpha$ and $f_1(x)$ are such that in this region

$$\alpha \ll \int_a^b f_1(x)\,dx.$$   (3.37)

The number of $P_1$ observations (O's in Fig. 3.9) in [a,b] approaches $n \int_a^b f_1(x)\,dx$ as $n \to \infty$, by Tchebycheff's theorem. Since the number of blocks "m" is approximately equal to $\alpha n$, equation 3.37 implies that

$$m \ll n \int_a^b f_1(x)\,dx = \text{number of O's in } [a,b].$$

By the assumption that $f_2(x)$ is nonzero everywhere, there are X's in the interval [a,b] with probability 1 (as $n \to \infty$). Suppose that in the interval [a,b]

$$f_2(x)/f_1(x) \ll \alpha.$$

This means that $m f_1(x) \gg n f_2(x)$. Therefore on the average each X is surrounded by many more than m O's. Then the region $R_2 = \bigcup_{i=1}^{m} B_i$ consists largely of short, unconnected intervals centered on the X's.

We now wish to determine the average length of these intervals. The average number of O's in [a,b] is $n \int_a^b f_1(x)\,dx$. Therefore $(b-a) / (n \int_a^b f_1(x)\,dx)$ is the average distance between the O's. Since each X is surrounded by many more than m O's, the length of each interval can be no more than

$$\frac{m\,(b-a)}{n \int_a^b f_1(x)\,dx}.$$

The length of each interval is in fact equal to this quantity if the m closest O's to an X are as likely to be in [a,b] as in all other regions. In this case as $n \to \infty$ the length of each interval in [a,b] is $\alpha (b-a) / \int_a^b f_1(x)\,dx$. Since this is a finite number, there would be nonzero sections in areas where $f_2(x)/f_1(x)$ is very small. This is not the case when $R_2$ is determined by a likelihood-ratio test of the form $f_2(x)/f_1(x) > C$, where C is a threshold.

Note that all assumptions are such as to minimize the extent of
region $R_2$ in places where $f_1(x)$ is large. Thus if these
assumptions are removed, the result holds a fortiori. For example,
consider the assumption

$$f_2(x)/f_1(x) \ll \alpha$$

in the interval $[a,b]$. If this assumption is not made, we cannot say
that each X in $[a,b]$ is a nucleus for a small section of region
$R_2$. In fact, several blocks in $[a,b]$ may coalesce into connected
intervals. However, this only increases the extent of region $R_2$ in
areas that would be excluded by a likelihood ratio test.

The fact that the AHE-DFTR test does not, in general, approach
the likelihood ratio test can also be demonstrated as follows. When
the AHE procedure is used, the lengths of the intervals surrounding all
$P_2$ observations are equal. As $n \to \infty$ the number of $P_2$
observations in a small interval of length $\epsilon$ around the
maximum of $f_2(x)$ is greater than the number of $P_2$ observations in
a small interval of length $\epsilon$ around any other point of
$f_2(x)$. Hence if the intervals surrounding the $P_2$ observations
coalesce, they would most likely coalesce in the region where $f_2(x)$
is a maximum. Using the likelihood ratio procedure, the smallest accept
region is in the vicinity of the maximum of $f_2(x)/f_1(x)$. The
maximum of $f_2(x)$ and the maximum of $f_2(x)/f_1(x)$ do not
necessarily occur at the same point. Hence, in general, the AHE
procedure does not approach the likelihood ratio procedure as $n$
approaches infinity.
It is not known at this time how to determine the large sample
properties of the OHC or the CHS procedure. Therefore, an example
was simulated on the computer. The probability densities were
arbitrarily chosen to be

$$f_1(x) = \frac{1}{\sqrt{2\pi}} \exp\{ -\tfrac{1}{2} (x - 1.75)^2 \}$$

and

$$f_2(x) = \frac{1}{\sqrt{2\pi}} \exp\{ -\tfrac{1}{2} (x + 1.75)^2 \} .$$

Equal a priori probabilities and equal costs of misrecognition were
assumed. The optimum decision in a Bayes sense is a decision in favor
of $P_2$ if a new observation has a value which is less than zero. This
yields a false alarm rate $\alpha = 0.0401$. The length of the accept
region ($R_2$) to the positive side of zero and the length of the
accept region to the negative side of zero were then found for samples
of 24, 49, 99, 250, 500, and 1000 observations from each class for both
the AHE and the CHS procedures. The value of $m$ was chosen so that

$$E\Big\{ \int_{R_2} dF_1(x) \Big\} \approx 0.0401 .$$

Therefore values of $m$ of 1, 2, 4, 10, 20, and 40 were used for the
samples of 24, 49, 99, 250, 500, and 1000, respectively.
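
The numbers quoted above can be checked with a few lines of Python. This is only a sketch: it assumes the expected false alarm probability of an $m$-block region is $m/(n+1)$, and the rounding rule in `blocks_for` is an assumption made for illustration, not something the report states.

```python
from math import erfc, sqrt


def normal_cdf(x, mean=0.0, sd=1.0):
    """Gaussian cumulative distribution function via the complementary error function."""
    return 0.5 * erfc(-(x - mean) / (sd * sqrt(2.0)))


# False alarm rate of the Bayes rule "decide P2 when x < 0":
# the probability mass of f1 (mean +1.75, unit variance) below zero.
alpha = normal_cdf(0.0, mean=1.75)


def blocks_for(n, alpha):
    """Integer m for which m/(n+1) is closest to alpha (assumed rounding rule)."""
    return round(alpha * (n + 1))


sample_sizes = [24, 49, 99, 250, 500, 1000]
m_values = [blocks_for(n, alpha) for n in sample_sizes]
print(round(alpha, 4), m_values)
```

Under these assumptions the script reproduces both the false alarm rate and the list of block counts given in the text.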

The accept region for the likelihood ratio criterion is the
negative real line. If the DFTR procedures are to approach the
likelihood ratio procedure as $n \to \infty$, the length of the accept
region on the positive real line should approach zero. The length of
the accept region on the positive real line for both the AHE and CHS
DFTR procedures is shown in Figure 3.10. The results are not definitive
since they are based on one trial. However, they indicate that the
length of the accept region on the positive real line approaches some
value other than zero for both DFTR procedures.

The length of the accept region on the negative real line is shown
in Figure 3.11 for both procedures. If the DFTR tests approach the
likelihood ratio test, these curves should continually increase as
$n \to \infty$. Again the results are not definitive. However, it
appears that the curves approach some finite value rather than
infinity.

Both Figures 3.10 and 3.11 indicate that the CHS procedure
performs better than the AHE procedure for this particular situation.
The length of the accept region on the positive real line for the CHS
procedure is less than or equal to the corresponding length for the AHE
procedure for all sample sizes except $n = 250$. The length of the
accept region on the negative real line for the CHS procedure is
greater than or equal to the corresponding length for the AHE procedure
for all sample sizes.

Figure 3.12. Disjoint Probability Density Functions.

The ratio of the length of the accept region on the negative real
line to the length of the accept region on the positive real line gives
some indication of the relative performance of these procedures. These
ratios, labeled $R_{AHE}$ for the AHE procedure and $R_{CHS}$ for the
CHS procedure, are shown in Table 3.2 for the various sample sizes.

Table 3.2. A Comparison of the AHE and CHS Procedures

Number of Training
Observations           R_AHE      R_CHS

      49               11.0       13.8
      99               11.4       12.9
     250               15.4       17.2
     500               14.2       19.1
    1000               14.0       19.0

One sees that these ratios are not steadily increasing with sample size
as they should if the DFTR procedures are to approach a likelihood
ratio procedure. Note that $R_{CHS}$ is greater than $R_{AHE}$ for all
sample sizes.
3.7. An Optimum Ordering Procedure

An ordering which does approach the Neyman-Pearson procedure
is easily obtained if the class probability distributions are known.
This is achieved by simply using ordering functions equal to
$f_1(x)/f_2(x)$. That is, the following ordering functions are used:

$$h_1(x) = h_2(x) = \dots = h_{n_1}(x) = f_1(x)/f_2(x) . \qquad (3.38)$$

The fact that a DFTR procedure which uses the above ordering
functions approaches a Neyman-Pearson test is easily demonstrated.
When the Neyman-Pearson criterion is used, the accept region is

$$R_2 = \{ x : f_1(x)/f_2(x) \le C \}$$

where $C$ is determined so that

$$\int_{R_2} dF_1(x) = \alpha .$$

Let $R_2'$ be the region obtained when statistically equivalent blocks
are formed by the likelihood ratio ordering functions in equation 3.38.
By the Tchebycheff inequality

$$\int_{R_2'} dF_1(x) \xrightarrow{p} \alpha ,$$

where

$$R_2' = \bigcup_{i=1}^{m} B_i ,$$

$m$ is the largest integer satisfying

$$m \le \alpha (n_1 + 1) ,$$

and the blocks are given by

$$B_1 = \{ x : h_1(x) \le \min_{1 \le i \le n_1} h_1(x_i) \}$$

$$\vdots$$

$$B_m = \{ x : h_m(x) \le \min_{\substack{1 \le i \le n_1 \\ i \ne (1), \dots, (m-1)}} h_m(x_i) \} \cap \bigcap_{i=1}^{m-1} \bar{B}_i .$$

Let

$$\min_{\substack{1 \le i \le n_1 \\ i \ne (1), \dots, (m-1)}} h_m(x_i) = C_m .$$

Since $h_1(x) = h_2(x) = \dots = h_m(x)$,

$$R_2' = \{ x : h_m(x) \le C_m \} = \{ x : f_1(x)/f_2(x) \le C_m \} .$$

It now remains for us to show that $C_m \xrightarrow{p} C$. But this has
to be true since

$$\int_{\{x : f_1(x)/f_2(x) \le C_m\}} dF_1(x) \xrightarrow{p} \int_{\{x : f_1(x)/f_2(x) \le C\}} dF_1(x)$$

and $f_1(x) > 0$ and $f_2(x) > 0$ for all $x$. Therefore the DFTR
procedure with the likelihood ratio ordering functions approaches a
Neyman-Pearson test. For example, let

$$f_1(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x-\mu)^2} \quad \text{and} \quad f_2(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x+\mu)^2}$$

where $\mu > 0$. Then

$$f_1(x)/f_2(x) = e^{2\mu x} .$$

Using $e^{2\mu x}$ as an ordering function, the blocks are

$$B_1 = (-\infty, X_{(1)}]$$
$$B_2 = (X_{(1)}, X_{(2)}]$$
$$\vdots$$
$$B_m = (X_{(m-1)}, X_{(m)}]$$

Therefore $R_2' = (-\infty, X_{(m)}]$.

The accept region to satisfy the Neyman-Pearson criterion is
$(-\infty, z)$ where

$$\int_{-\infty}^{z} f_1(x)\,dx = F_1(z) = \alpha .$$

By Tchebycheff's inequality

$$\int_{-\infty}^{X_{(m)}} f_1(x)\,dx = F_1(X_{(m)}) \xrightarrow{p} \alpha .$$

Since $f_1(x) > 0$ for all $x$ ($F_1(x)$ is a monotone increasing
function), $X_{(m)} \xrightarrow{p} z$. Therefore the accept region for
the DFTR procedure using the likelihood ratio ordering functions
converges in probability to the accept region for the likelihood ratio
procedure.
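
The convergence of $X_{(m)}$ to $z$ in this Gaussian example can be illustrated numerically. The sketch below is not part of the original experiment: the function name `lr_accept_boundary`, the seed, and the sample sizes are hypothetical choices, and $\mu = 1.75$, $\alpha = 0.0401$ are borrowed from the simulation of Section 3.6 only for concreteness.

```python
import random
from statistics import NormalDist


def lr_accept_boundary(p1_sample, alpha):
    """Upper end X_(m) of R2' = (-inf, X_(m)], built from the m lowest P1 observations.

    Ordering by exp(2*mu*x) is the same as ordering by x itself,
    because the function is increasing in x when mu > 0.
    """
    n1 = len(p1_sample)
    m = int(alpha * (n1 + 1))  # largest integer with m <= alpha * (n1 + 1)
    if m < 1:
        raise ValueError("sample too small for this alpha")
    return sorted(p1_sample)[m - 1]


mu, alpha = 1.75, 0.0401
z = NormalDist(mu, 1.0).inv_cdf(alpha)  # Neyman-Pearson boundary: F1(z) = alpha

rng = random.Random(0)  # fixed seed, for repeatability only
for n1 in (100, 1000, 20000):
    sample = [rng.gauss(mu, 1.0) for _ in range(n1)]
    print(n1, round(lr_accept_boundary(sample, alpha), 3), round(z, 3))
```

As `n1` grows, the printed boundary should settle near `z`, which is what the Tchebycheff argument above predicts.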

It should be noted that the AHE, OHC, and CHS tests approach
Neyman-Pearson tests in the limit if the class probability densities
are disjoint. That is,

$$\int_{R_2} dF_1(x) \xrightarrow{p} \alpha \qquad (3.39)$$

and

$$\int_{R_2} dF_2(x) = 1 \qquad (3.40)$$

when

$$\int_{\text{all } x} f_1(x) f_2(x)\,dx = 0 .$$

For example, consider the configuration of Figure 3.12. $f_1(x)$ is
uniform over $[0,1]$ and $f_2(x)$ is uniform over $[2,3]$. The region
$R_2$ produced by the AHE ordering procedure as $n_1$ and $n_2$
approach infinity is as shown in the figure. It is easily seen that
conditions (3.39) and (3.40) are satisfied here.

3.8. A Further Comparison of the AHE and CHS Procedures

A further comparison of the AHE and CHS procedures can be
obtained by observing the average length of an interval which surrounds
a $P_2$ observation as the number of blocks used to form the tolerance
region varies. This is an appropriate comparison because we believe
that the additional volume (length) of the accept region that the CHS
procedure produces over the AHE procedure, if any, is located so that
the probability of correct detection increases and so that the
probability of false alarm stays approximately constant (for large
$n$). In fact, the comparison of the length of the accept region on the
positive and negative real line in Figures 3.10 and 3.11 seems to
demonstrate this.

The average length of an interval surrounding a $P_2$ observation
versus the number of blocks used to form $R_2$ is shown in Figures
3.13, 3.14, and 3.15 for samples of 250, 500, and 1000 observations.
For example, let us consider Figure 3.13. For 10 blocks used to form
$R_2$ the length of an interval surrounding a $P_2$ observation for the
AHE procedure is .00495. For the CHS procedure, 241 of the intervals
have a length of .0095. The other 9 intervals vary in length from .0002
to .0069. This gives an average interval length of .00925.

Figure 3.13. Average Length of an Interval Surrounding a $P_2$
Observation. (250 Observations from Each Class)

As seen in Figure 3.13, no benefit is obtained in using the CHS
procedure over the AHE procedure for $m < 6$ blocks in $R_2$. As more
blocks are added, the average length of an interval surrounding a $P_2$
observation for the CHS procedure becomes larger than the length of
an interval for the AHE procedure. As still more blocks are added, a
point is reached where the average length of an interval for the CHS
procedure becomes much larger than the length of an interval for the
AHE procedure. This is the point at which most of the $P_1$
observations in the areas where the $P_1$ and $P_2$ observations are
highly confused have been ordered.

Similar curves for samples of 500 and 1000 observations from
each class are shown in Figures 3.14 and 3.15, respectively. These
curves seem to indicate that a good deal of the benefit of the CHS
procedure over the AHE procedure had not been revealed for
$m/(n+1) = 0.0401$ ($m = 20$ in the 500 sample experiment and $m = 40$
in the 1000 sample experiment). This value of $m/(n+1)$ was, of course,
used to obtain the results of Figures 3.10 and 3.11. If more blocks had
been used to obtain those figures, it is likely that a larger
improvement in the CHS performance over the AHE performance would be
noted for $n = 500$ and $n = 1000$.


Figure 3.14. Average Length of an Interval Surrounding a $P_2$
Observation. (500 Observations from Each Class)

Figure 3.15. Average Length of an Interval Surrounding a $P_2$
Observation. (1000 Observations from Each Class)


3.9.  Measure  of  the  Miss  Probability 

A recognition system has been proposed which classifies with
a given expected false alarm probability (or with a given confidence
that the false alarm probability is less than a given quantity). It
also correctly classifies all $P_2$ training observations.
Nevertheless, one can find situations in which the classifier may
perform poorly. For example, consider the two-dimensional measurement
space of Figure 3.16. A classifier is designed using the AHE procedure
for an expected false alarm probability of 0.20. Region $R_2$ consists
of two blocks and is the region inside the three circles which are
centered on the X's. Since none of the regions encircling the X's are
connected, one feels that the miss probability could be quite large.

If  this  classifier  is  to  be  used  in  a  practical  problem,  a  measure 
of  the  expected  miss  error  is  needed.  Then  if  the  expected  miss  error 
is  much  larger  than  desired,  one  can  redesign  the  classifier  by  using 
more  observations,  by  using  more  P^  observations,  or  by  using 

a  different  measurement  space. 

Suppose a classifier has been designed by one of the methods
previously discussed. The $P_1$ observations have been ordered with
hyperspheres which expand from each of the $P_2$ observations. Now
suppose the $P_2$ observations are ordered, thus forming blocks with
respect to $F_2(x)$. The number of $F_2(x)$ blocks which are contained
in region $R_2$ can be counted and statements such as

$$E\Big\{ \int_{R_1} dF_2(x) \Big\} \le \omega \qquad (3.41)$$

or

$$\Pr\Big\{ \int_{R_1} dF_2(x) \le \beta \Big\} \ge \gamma \qquad (3.42)$$

can be made, where $R_1$ is the complement of $R_2$. The quantities
$\omega$, $\beta$, and $\gamma$ are determined from $n_2$, the number
of $P_2$ observations used to design the classifier, and from $b$, the
number of $F_2(x)$ blocks inside $R_2$. This procedure may not be
distribution-free, as will be discussed later in this section.

One can immediately see that if the functions for ordering the
$P_2$ observations are not judiciously chosen, a very poor estimate of
the expected miss probability may be obtained. For example, consider
Figure 3.17.

As in the previous figures, the $P_1$ observations are represented
by O's and the $P_2$ observations by X's. The region $R_2$, as shown,
was constructed for an expected false alarm probability of 0.1. Suppose
a measure of the expected miss probability is desired for this
classifier. The X's can now be ordered so that this measure can be
obtained. Suppose the X's are ordered by hyperspheres which expand from
all of the O's. However, none of the blocks which are formed by this
ordering will lie entirely in region $R_2$. Since the theory of
distribution-free tolerance regions gives no information about the
cumulative distribution contained in a partial block, this ordering
procedure is useless for making a statement about the cumulative
distribution $F_2(x)$ in $R_2$.

Because of the procedure used to order the $P_1$ observations,
all $P_2$ observations must lie in $R_2$. Hence, all blocks formed by
ordering the $P_2$ observations must have subsets which are contained
in $R_2$. Since the theory of distribution-free tolerance regions gives
no information about the amount of probability in a partial block, good
estimates of the cumulative distribution $F_2(x)$ in $R_2$ are made
only if the ordering procedure is such as to allow the blocks which are
formed to be contained in $R_2$.

Consider the following procedure for ordering the $P_2$
observations. The procedure consists essentially of first locating a
$P_2$ observation. A search is then made for another $P_2$ observation
with a hypersphere which expands from the first $P_2$ observation. When
the second $P_2$ observation is found, hyperspheres expand from both
$P_2$ observations in search of a third $P_2$ observation. This process
is continued until $n_2 + 1$ blocks are formed. The number of blocks
contained in $R_2$ is counted, thereby giving numbers for $\omega$,
$\beta$, and $\gamma$ in equations 3.41 and 3.42.

The form of the first ordering function $h_1(x)$ is arbitrary.
For simplicity, let $h_1(x)$ be linear:

$$h_1(x) = A^T x \qquad (3.43)$$

where $A$ is a vector constant. Let $Y^{(2)}_{(1)}$ be the $P_2$
observation which satisfies

$$h_1(Y^{(2)}_{(1)}) = \min_{1 \le i \le n_2} A^T X^{(2)}_i .$$

Here $Y^{(2)}_{(1)}$ is used rather than $X^{(2)}_{(1)}$ to avoid
confusion with $X^{(1)}_{(i)}$, which was used when the $P_1$
observations were being ordered. Then a block $C_1$ is formed, where

$$C_1 = \{ x : h_1(x) < h_1(Y^{(2)}_{(1)}) \} . \qquad (3.43a)$$

But $C_1$ is not contained in $R_2$.

Note that a function such as

$$h_1(x) = | x - A |$$

can be used as the first ordering function. Then $C_1$ may or may not
be contained in $R_2$ depending on the choice of the vector $A$. Since
$A$ cannot be chosen to guarantee that $C_1 \subset R_2$, the estimate
of the miss probability will vary for a given $R_2$ with the choice of
$A$. Moreover, it is unlikely in an unbounded space and without any
knowledge of the location of the $P_2$ observations that the vector $A$
will be chosen so that the first block is contained in the bounded
region $R_2$.

To form the second block a search is made for a $P_2$ observation
with a hypersphere which expands from $Y^{(2)}_{(1)}$. Let the second
ordering function be given by

$$h_{21}(x, Y^{(2)}_{(1)}) = | x - Y^{(2)}_{(1)} | . \qquad (3.44)$$

Let $Y^{(2)}_{(2)}$ be such that

$$h_2(Y^{(2)}_{(2)}) = \min_{\substack{1 \le i \le n_2 \\ X^{(2)}_i \ne Y^{(2)}_{(1)}}} h_{21}(X^{(2)}_i, Y^{(2)}_{(1)}) . \qquad (3.45)$$

Then the second block is given by

$$C_2 = \{ x : h_{21}(x, Y^{(2)}_{(1)}) < h_2(Y^{(2)}_{(2)}),\ h_1(x) > h_1(Y^{(2)}_{(1)}) \} . \qquad (3.46)$$


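Let the functions for ordering the third $P_2$ observation be given by

$$h_{3j}(x, Y^{(2)}_{(j)}) = | x - Y^{(2)}_{(j)} | , \quad j = 1, 2 . \qquad (3.47)$$

Let

$$h_3(Y^{(2)}_{(3)}) = \min_{\substack{1 \le i \le n_2 \\ X^{(2)}_i \ne Y^{(2)}_{(1)}, Y^{(2)}_{(2)}}} \ \min_{j = 1, 2} h_{3j}(X^{(2)}_i, Y^{(2)}_{(j)}) \qquad (3.48)$$

and

$$K_{3j} = \{ x : h_{3j}(x, Y^{(2)}_{(j)}) \le h_3(Y^{(2)}_{(3)}) \} , \quad j = 1, 2 . \qquad (3.49)$$

The third block is given by

$$C_3 = \Big( \bigcup_{j=1}^{2} K_{3j} \Big) \cap \Big( \bigcap_{i=1}^{2} \bar{C}_i \Big) . \qquad (3.50)$$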

The functions for ordering the $r$th $P_2$ observation are given by

$$h_{rj}(x, Y^{(2)}_{(j)}) = | x - Y^{(2)}_{(j)} | , \quad j = 1, \dots, r-1 . \qquad (3.51)$$

Let

$$h_r(Y^{(2)}_{(r)}) = \min_{\substack{1 \le i \le n_2 \\ X^{(2)}_i \ne Y^{(2)}_{(1)}, \dots, Y^{(2)}_{(r-1)}}} \ \min_{1 \le j \le r-1} h_{rj}(X^{(2)}_i, Y^{(2)}_{(j)}) \qquad (3.52)$$

and

$$K_{rj} = \{ x : h_{rj}(x, Y^{(2)}_{(j)}) \le h_r(Y^{(2)}_{(r)}) \} , \quad j = 1, \dots, r-1 . \qquad (3.53)$$

Then the $r$th block is given by

$$C_r = \Big( \bigcup_{j=1}^{r-1} K_{rj} \Big) \cap \Big( \bigcap_{i=1}^{r-1} \bar{C}_i \Big) . \qquad (3.54)$$
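
The ordering of equations 3.43 through 3.54 can be sketched in a few lines. The code below is an illustration only, not the report's program: the function name `order_by_expanding_spheres` and its return format are hypothetical, Euclidean distance is assumed for $|x - Y|$, and ties are broken by input order.

```python
import math


def order_by_expanding_spheres(points, A):
    """Order P2 observations by a linear first function, then by expanding spheres.

    points: list of coordinate tuples (the P2 training observations).
    A: direction vector of the first ordering function h1(x) = A . x.
    Returns (order, h): order[r] is the index of the (r+1)-th ordered
    observation and h[r] is the ordering-function value at which it was found.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def dist_to_ordered(i, ordered):
        # Radius at which a sphere grown from any ordered point first reaches point i.
        return min(math.dist(points[i], points[j]) for j in ordered)

    remaining = list(range(len(points)))
    first = min(remaining, key=lambda i: dot(A, points[i]))
    order, h = [first], [dot(A, points[first])]
    remaining.remove(first)
    while remaining:
        nxt = min(remaining, key=lambda i: dist_to_ordered(i, order))
        h.append(dist_to_ordered(nxt, order))
        order.append(nxt)
        remaining.remove(nxt)
    return order, h


pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (1.5, 0.0)]
order, h = order_by_expanding_spheres(pts, A=(1.0, 0.0))
print(order, h)
```

Each value in `h` after the first is the common sphere radius at which the next observation was reached, which is the quantity later compared with the radius of the hyperspheres making up $R_2$.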


If  the  AHE  procedure  has  been  used  to  design  the  classifier,  the 

regions  L  (equation  3.16)  have  the  same  radius  for  a  particular  k 
jk  r 

and  for  all  r  =  l ,  .  .  . , n^  .  Suppose  m  blocks  were  used  to  form  . 

If

$$h_2(Y^{(2)}_{(2)}) \le d_m(X^{(1)}_{(m)}) ,$$

the entire block

$$C_2 = \{ x : h_{21}(x, Y^{(2)}_{(1)}) < h_2(Y^{(2)}_{(2)}),\ h_1(x) > h_1(Y^{(2)}_{(1)}) \}$$

is contained in $R_2$. For the general case it is easily seen that if

$$h_i(Y^{(2)}_{(i)}) \le d_m(X^{(1)}_{(m)}) \qquad (3.55)$$

the entire block $C_i$, $i = 1, \dots, n_2$, is contained in $R_2$. It
is also seen that if

$$h_i(Y^{(2)}_{(i)}) > d_m(X^{(1)}_{(m)}) \qquad (3.56)$$

the block $C_i$ may or may not be contained in $R_2$. If equation 3.56
is satisfied, one must find a point in $C_i$ which does not lie in
$R_2$ to show that $C_i \not\subset R_2$. However, to show that $C_i$
is contained in $R_2$ one must show that all points which lie in $C_i$
also lie in $R_2$. This is, of course, no easy task. A conservative
statement about the miss probability can be made by counting the number
of times equation 3.55 is satisfied for $i = 1, \dots, n_2$.

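Let the total number of blocks contained in $R_2$ (from equation
3.55) be $b$. Then

$$E\Big\{ \int_{R_2} dF_2(x) \Big\} \ge \frac{b}{n_2 + 1}$$

and the variance $\sigma^2$ is given by

$$\sigma^2 = \begin{cases} \dfrac{b\,(n_2 + 1 - b)}{(n_2 + 1)^2 (n_2 + 2)} , & b \ge \dfrac{n_2 + 1}{2} \\[2ex] \dfrac{(b + 1)(n_2 - b)}{(n_2 + 1)^2 (n_2 + 2)} , & b < \dfrac{n_2 + 1}{2} \end{cases} \qquad (3.57)$$

The values of $\beta$ and $\gamma$ in equation 3.42 can be found by
consulting Murphy's graphs (1948).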
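
The bound and variance can be evaluated directly once $b$ and $n_2$ are known. The sketch below assumes equation 3.57 is read as a lower bound $b/(n_2+1)$ on the expected coverage of $R_2$ under $F_2$, with the two-branch variance given above; the printed equation is partly illegible, so this reading is an assumption. The function name `miss_summary` is hypothetical.

```python
from fractions import Fraction


def miss_summary(b, n2):
    """Return (expected-miss upper bound, variance) for b whole blocks inside R2.

    Assumes each of the n2 + 1 blocks has expected coverage 1/(n2 + 1),
    so the expected coverage of R2 is at least b/(n2 + 1).
    """
    if not 0 <= b <= n2 + 1:
        raise ValueError("b must lie between 0 and n2 + 1")
    denom = (n2 + 1) ** 2 * (n2 + 2)
    if 2 * b >= n2 + 1:
        variance = Fraction(b * (n2 + 1 - b), denom)
    else:
        variance = Fraction((b + 1) * (n2 - b), denom)
    miss_upper = 1 - Fraction(b, n2 + 1)
    return miss_upper, variance


miss_upper, variance = miss_summary(b=5, n2=6)
print(miss_upper, variance)
```

Exact fractions are used so the result can be compared with hand calculations; with floating-point inputs the same formulas apply.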


Suppose equation 3.43 is used to form the first block $C_1$. This
block is not contained in $R_2$. Suppose also that $R_2$ is bounded.
Note that $R_2$ is always bounded except in the trivial case where
$R_2$ is the entire measurement space. If $R_2$ is bounded, the last
block $C_{n_2+1}$ is not contained in $R_2$. Therefore, under normal
conditions,

$$b \le n_2 - 1 .$$

This gives a minimum expected miss probability

$$E\Big[ \int_{R_1} dF_2(x) \Big] \ge \frac{2}{n_2 + 1} . \qquad (3.58)$$
To be certain that the blocks are distribution-free, the
ordering functions and region $R_2$ should be chosen before the $P_2$
observations are taken. This, of course, cannot be done here because
$R_2$ is determined from the $P_2$ observations. Hence this ordering
procedure may produce an estimate of the probability of a miss which
is not distribution-free. This procedure is used here only for
obtaining a rough estimate of the probability of a miss, so that the
classifier can be redesigned if the estimate is much poorer than the
desired miss probability.

We expect the estimate to perform this function adequately
because it measures how well the hyperspheres making up $R_2$ are
connected. For example, any time a block is counted as contained in
$R_2$ by equation 3.55 we know that one of the hyperspheres making up
$R_2$ contains the center of another of the hyperspheres making up
$R_2$. These two hyperspheres are certainly connected.


The  OHC-R  and  CHS-R  Procedures 

If the OHC or the CHS procedure has been used to design the
classifier, the hyperspheres which make up $R_2$ do not all have the
same radius. Therefore, the number of blocks contained in $R_2$
varies with the $P_2$ observation with which the ordering starts.
Furthermore, with classifiers of the OHC and CHS designs, the $i$th
block is not necessarily contained in $R_2$ when

$$h_i(Y^{(2)}_{(i)}) \le d_m(X^{(1)}_{(m)}) .$$
Figure 3.18 illustrates this fact with the X's representing the $P_2$
observations. Suppose $R_2$ is the union of the area inside the four
hyperspheres which are shown with thick lines. The boundaries for the
$C_i$ blocks are shown with narrow lines. Suppose the ordering function
for ordering the first observation is given by $h_1(x) = x_1$. Then the
observation $X_1$ in the figure is the first ordered $P_2$ observation,
$Y^{(2)}_{(1)}$. A hypersphere is then allowed to expand from $X_1$ in
search for a new $P_2$ observation. The observation
$Y^{(2)}_{(2)} = X_2$ is located and block $C_2$ is defined. Block
$C_2$ is contained in $R_2$. Now hyperspheres expand from $X_1$ and
$X_2$ in search of a new $P_2$ observation. The observation
$Y^{(2)}_{(3)} = X_3$ is located. Note that

$$h_3(Y^{(2)}_{(3)}) \le d_m(X^{(1)}_{(m)}) .$$

However, this block is not contained in $R_2$ because an area outside
of the hypersphere which makes up $R_2$ and surrounds one of the
observations already ordered (the shaded area) is included in block
$C_3$.

We reason here that our foremost concern is that we measure
whether the hyperspheres are connected. Using this reasoning, we should
use the radii of the hyperspheres which make up $R_2$ in ordering the
$P_2$ observations. These ordering procedures are called the OHC-R and
the CHS-R procedures. The ordering functions are given by

$$h_{rj}(x, Y^{(2)}_{(j)}) = \frac{d_m(X^{(1)}_{(m)})}{d_R(Y^{(2)}_{(j)})} \, | x - Y^{(2)}_{(j)} | \qquad (3.59)$$

where $d_m(X^{(1)}_{(m)})$ is the radius of the largest hypersphere
making up $R_2$ and $d_R(Y^{(2)}_{(j)})$ is the radius of the
hypersphere making up $R_2$ and centered at $Y^{(2)}_{(j)}$. This
equation simply alters the rate of expansion of the hypersphere from
the observation $Y^{(2)}_{(j)}$ so that a hypersphere expanding from
any other observation reaches the hypersphere of $R_2$ which was
centered at that observation at the same time as the hypersphere which
expands from $Y^{(2)}_{(j)}$ reaches the hypersphere of $R_2$ which was
centered at $Y^{(2)}_{(j)}$.
3.10. An Ordering Procedure which Gives Distribution-Free Measures
of Both the False Alarm and Miss Errors

It was concluded in the last section that the procedure given
there for obtaining a measure of the miss probability may not be
independent of the distribution. An ordering procedure is presented in
this section which does give distribution-free measures of both the
miss probability and the false alarm probability. However, the ordering
procedure can at times yield very poor classification regions, as will
be seen later.

As before, we wish to fix the confidence that the false alarm rate
is less than a given quantity. To obtain a distribution-free estimate
of both the miss probability and the false alarm probability, we must
specify the ordering functions before the outcome of the observations
is known, cf. Murphy (1948). This can be done by using the ordering
functions of equations 3.43 through 3.54. However, now these functions
(with one exception) are used to simultaneously order both the $P_1$
and the $P_2$ observations. The exception is the first ordering
function, which is used to order a $P_2$ observation. The first block
is assigned to $R_1 = X - R_2$. The remaining blocks are assigned to
region $R_2$ until the number of $P_1$ blocks assigned to $R_2$ is $m$.
The number $m$ is determined from the desired false alarm probability.
We will illustrate the ordering procedure by the two-dimensional
example of Figure 3.19. Here the X's represent the $P_2$ observations
and the O's the $P_1$ observations. The number of $P_2$ observations is
6; the number of $P_1$ observations is 9. Suppose that $m = 1$ gives
the expected false alarm probability that is desired. The first
ordering function is used to order a $P_2$ observation. Suppose $A$ in
equation 3.43 is chosen so that


$$h_1(x) = (1 \;\; 0) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} .$$

The first block $C_1 = \{ x : h_1(x) < h_1(X_{(1)}) \}$ (see Fig. 3.19)
is assigned to $R_1 = X - R_2$. The ordering is continued, with
equations 3.44 through 3.54 being used to order both the $P_1$ and
$P_2$ observations. This ordering proceeds as follows. A hypersphere
expands from $X_{(1)}$ in search of an observation from either class.
The block is to be added to $R_2$ regardless of whether the observation
is a $P_1$ observation or a $P_2$ observation. Luckily, in our example
a $P_2$ observation $X_{(2)}$ is found. Next, hyperspheres expand from
both $X_{(1)}$ and $X_{(2)}$ in search of an observation from either
class. A $P_2$ observation $X_{(3)}$ is found. One may trace through
the remaining ordering and find that $X_{(4)}$, $X_{(5)}$, $X_{(6)}$,
and $O_{(1)}$ are located in that order. The region $R_2$ is then the
shaded region of Figure 3.19. It contains one complete $P_1$ block,
five complete $P_2$ blocks, and part of another block. Hence the
expected false alarm probability is 1/10 and the expected miss
probability is less than 2/7.
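
The two figures quoted for this example follow from the block counts. As a short check, assuming the usual distribution-free result that each of the $n + 1$ blocks formed from $n$ observations has expected coverage $1/(n+1)$, and assuming the partial block belongs to $P_2$ (which is why the miss bound is a strict inequality):

$$E\{\text{false alarm}\} = \frac{m}{n_1 + 1} = \frac{1}{9 + 1} = \frac{1}{10} , \qquad E\{\text{miss}\} < 1 - \frac{5}{n_2 + 1} = 1 - \frac{5}{7} = \frac{2}{7} .$$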

Note that the ordering procedure illustrated above will generally
yield poor results if $F_2(x)$ is a multimodal distribution. This is
easily seen by noting that a cluster of X's in the lower right corner
of Figure 3.19 would not be included in $R_2$ before $O_{(1)}$ is found
and the ordering is curtailed. However, one does know from the number
of $P_2$ blocks in $R_2$ that the classification regions will yield a
poor miss probability. Then some other classification rule can be used.

The ordering procedure can be changed so that it is more
acceptable for multimodal situations. For example, if at any time in
the process (before $m$ $P_1$ observations are ordered) more $P_1$
blocks are being formed than $P_2$ blocks, a hyperplane can be used to
search for a $P_2$ observation. The $P_2$ block which is formed is
assigned to region $R_1$. A hypersphere is then allowed to expand from
the new $P_2$ observation and blocks are again added to region $R_2$.
We have essentially searched for another mode of the $F_2(x)$
distribution.


The procedure presented in this section is somewhat different
from the AHE, OHC, and CHS procedures presented in section 3.4. The
procedure of this section gives distribution-free measures of both the
miss probability and the false alarm probability. However, it does not
maximize the number of training observations which are recognized
as members of class 2. Furthermore, there may be some difficulty in
making this procedure work well for multimodal distributions. For these
reasons, the procedure was not used in the automatic speaker verification
experiment of the next chapter.

3.11.  Multi-class  Problem 

The extension of the ordering procedures of this chapter to more
than two classes is investigated in this section. Suppose there exist
K classes where K > 2. Using distribution-free tolerance regions,
Quesenberry and Gessaman (1968) propose a classifier in which all
probabilities of misclassification are specified. This approach is made
possible by a region in which no decision is made. This region will be
called the rejection region. To see how this approach differs from the
hypersphere DFTR approach, consider a 2 class example. Suppose the
distributions are bivariate and unimodal. Quesenberry and Gessaman
suggest that a reasonable ordering might be one which yields a bounded
convex region for each class.

Suppose the ordering of Figure 2.1 is used to order both the $P_1$ and the
$P_2$ observations. The resulting decision regions might appear as shown
in Figure 3.20. Region $R_1$ is the region obtained by ordering the $P_1$
observations and $R_2$ is the region obtained by ordering the $P_2$
observations. The decision rule is such that a new observation V is classified
as a $P_1$ observation if $V \in \{R_1 - (R_1 \cap R_2)\}$. It is classified as a
$P_2$ observation if $V \in \{R_2 - (R_1 \cap R_2)\}$. No decision is made if
$V \in \{(R_1 \cap R_2) \cup (\bar{R}_1 \cap \bar{R}_2)\}$. The difficulty with
this procedure is that a large rejection region,
$(R_1 \cap R_2) \cup (\bar{R}_1 \cap \bar{R}_2)$, may result according to the
ordering functions chosen and the proximity of the classes. Note also that for a
many-variate problem, an ordering like the one of Figure 3.20 requires many
blocks to be removed from both $R_1$ and $R_2$ if $R_1$ and $R_2$ are to be
closed regions.

Figure 3.20. A Two-Class Hyperplane Approach.

Figure 3.21. A Two-Class Hypersphere Approach.


The conditional probabilities of error are specified as follows.
Let the probability that V is classified as a $P_j$ observation when it is
a $P_i$ observation be denoted by p(j/i) where i = 1, 2 and j = 1, 2. Let
$R_1$ and $R_2$ be such that

\[ \Pr\Big\{ \int_{\bar{R}_1} dF_1(x) \le \beta_1 \Big\} \ge \gamma_1 \]

and

\[ \Pr\Big\{ \int_{\bar{R}_2} dF_2(x) \le \beta_2 \Big\} \ge \gamma_2 . \tag{3.60} \]

Note that

\[ p(2/1) = \int_{R_2 - R_1 \cap R_2} dF_1(x) \tag{3.61} \]

and

\[ p(1/2) = \int_{R_1 - R_1 \cap R_2} dF_2(x) . \]

Since $(R_2 - R_1 \cap R_2) \subset \bar{R}_1$ and $(R_1 - R_1 \cap R_2) \subset \bar{R}_2$,

\[ p(2/1) \le \int_{\bar{R}_1} dF_1(x) \tag{3.62} \]

and

\[ p(1/2) \le \int_{\bar{R}_2} dF_2(x) . \]

Then

\[ \Pr\{ p(2/1) \le \beta_1 \} \ge \gamma_1 \]

and

\[ \Pr\{ p(1/2) \le \beta_2 \} \ge \gamma_2 . \tag{3.63} \]
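A statement of the form $\Pr\{\text{coverage} \le \beta\} \ge \gamma$ can be evaluated numerically once one knows how many blocks make up the region in question. The sketch below assumes the standard result for statistically equivalent blocks: the coverage of r of the n+1 blocks formed by n observations from a continuous distribution follows a Beta(r, n+1-r) distribution, whose distribution function can be written as a binomial tail sum. The function name and arguments are illustrative and do not appear in the report.

```python
from math import comb


def confidence_level(n, r, beta):
    """Pr{coverage of r blocks <= beta} for n observations from a continuous distribution.

    The coverage of r blocks is distributed as the r-th order statistic of n
    uniform variates, so the probability equals Pr{Binomial(n, beta) >= r}.
    """
    if not (1 <= r <= n and 0.0 <= beta <= 1.0):
        raise ValueError("need 1 <= r <= n and 0 <= beta <= 1")
    return sum(comb(n, k) * beta**k * (1.0 - beta) ** (n - k) for k in range(r, n + 1))
```

For instance, if the complement of $R_1$ is made up of r blocks of the class 1 ordering, `confidence_level(n1, r, beta1)` would give the value of $\gamma_1$ attained in equation 3.60 under these assumptions.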

Of course, expanding hyperspheres can be used for ordering both
the $P_1$ and $P_2$ observations when one wishes to specify both
probabilities of error and when one is willing to accept a region in which no
decision is made. For example, see Figure 3.21. Here the hyperspheres
(circles in the figure) expand from the $P_2$ observations and order the
$P_1$ observations, thus forming $R_2$. Also, hyperspheres expand from
the $P_1$ observations and order the $P_2$ observations, thus forming $R_1$.
Suppose $R_1$ and $R_2$ are such that

\[ \Pr\Big\{ \int_{R_2} dF_1(x) \le \beta_1 \Big\} \ge \gamma_1 \]

and

\[ \Pr\Big\{ \int_{R_1} dF_2(x) \le \beta_2 \Big\} \ge \gamma_2 . \tag{3.64} \]

Since $(R_2 - R_1 \cap R_2) \subset R_2$ and $(R_1 - R_1 \cap R_2) \subset R_1$,
equation 3.63 is also satisfied by the hypersphere DFTR approach. Note that
equation 3.64 can be rewritten as

\[ \Pr\Big\{ \Big[ \int_{R_2 - R_1 \cap R_2} dF_1(x) + \int_{R_1 \cap R_2} dF_1(x) \Big] \le \beta_1 \Big\} \ge \gamma_1 \]

and

\[ \Pr\Big\{ \Big[ \int_{R_1 - R_1 \cap R_2} dF_2(x) + \int_{R_1 \cap R_2} dF_2(x) \Big] \le \beta_2 \Big\} \ge \gamma_2 . \tag{3.65} \]


Hence equation 3.63 is satisfied if all or part of the region $R_1 \cap R_2$ is
adjoined to either $R_1$ or $R_2$. This means that there is no need to
designate $R_1 \cap R_2$ as a rejection region when using the expanding
hypersphere approach. Another decision rule, for example the nearest-neighbor
rule of Fix and Hodges (1951), can be used to classify an event which occurs
in region $R_1 \cap R_2$. In this case the decision rule is

\[
\begin{array}{ll}
d_1 & \text{if } V \in R_1 - R_1 \cap R_2 \\
    & \text{or if } V \in R_1 \cap R_2 \text{ and } \min_i |V - x_i^{(2)}| > \min_i |V - x_i^{(1)}| \\
d_2 & \text{if } V \in R_2 - R_1 \cap R_2 \\
    & \text{or if } V \in R_1 \cap R_2 \text{ and } \min_i |V - x_i^{(2)}| < \min_i |V - x_i^{(1)}| \\
d_0 & \text{if } V \in \bar{R}_1 \cap \bar{R}_2
\end{array}
\tag{3.66}
\]

where $d_1$ means the decision is made that V is a $P_1$ observation,
$d_2$ means the decision is made that V is a $P_2$ observation, and $d_0$
means that no decision is made.
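A decision rule of this kind is short to program. The following is a minimal sketch, assuming each region is stored as a list of (center, radius) hyperspheres and that the training observations of each class are available for the nearest-neighbor step; the function names are hypothetical, and a tie in the nearest-neighbor distances is arbitrarily resolved in favor of class 2.

```python
import math


def in_region(v, spheres):
    """True if v lies inside at least one (center, radius) hypersphere of the region."""
    return any(math.dist(v, center) <= radius for center, radius in spheres)


def nearest_distance(v, points):
    """Distance from v to the closest training observation in `points`."""
    return min(math.dist(v, p) for p in points)


def decide(v, region1, region2, train1, train2):
    """Return 1 or 2 for a class decision, or 0 when no decision is made."""
    in1, in2 = in_region(v, region1), in_region(v, region2)
    if in1 and not in2:
        return 1
    if in2 and not in1:
        return 2
    if in1 and in2:
        # Overlap of the two regions: fall back on the nearest-neighbor rule.
        return 1 if nearest_distance(v, train1) < nearest_distance(v, train2) else 2
    return 0  # outside both regions: rejection
```

Representing a region as a union of spheres keeps the membership test independent of the dimension of the measurement space, since only distances are computed.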

On the other hand, for the ordering procedure of Figure 3.20, the
adjoining of the region $R_1 \cap R_2$ to $R_1$ or $R_2$ also satisfies equation
3.63. This follows because $R_1$ and $R_2$ were constructed to satisfy
equation 3.60. Then $R_1 \cap R_2$ need not necessarily be designated a
rejection region for the procedure in Figure 3.20. Note also that this
procedure need not necessarily yield a rejection region at all if
$R_1 \cap R_2 = \emptyset$, the null set. This is not the case for the
hypersphere procedure of Figure 3.21. This follows because the hyperspheres
are bounded by the observations of the opposite class. Hence
$R_1 \cap R_2 = \emptyset$ only if $R_1 = \emptyset$ or $R_2 = \emptyset$,
a trivial case.


Now consider the hypersphere DFTR approach for K > 2 classes.
A multiclass decision rule of the Neyman-Pearson type is used. The
particular ordering procedure to follow, and ultimately, whether a
rejection region is needed or not, depend on the desired outcome of the
classifier. For example, consider the three class problem. Let the
probability that V is classified as a $P_j$ observation when V is a $P_i$
observation be denoted by p(j/i), i = 1, 2, 3, j = 1, 2, 3. There are nine
conditional probabilities of classification which obey the following three
equations.

\[
\begin{aligned}
p(1/1) + p(2/1) + p(3/1) &= 1 \\
p(1/2) + p(2/2) + p(3/2) &= 1 \\
p(1/3) + p(2/3) + p(3/3) &= 1
\end{aligned}
\tag{3.67}
\]

For some problems the following criterion may be desirable.

\[
\begin{array}{ll}
\text{(a)} & \Pr\{ [p(2/1) + p(3/1)] \le \beta_1 \} \ge \gamma_1 \\
\text{(b)} & \Pr\{ [p(1/2) + p(3/2)] \le \beta_2 \} \ge \gamma_2 \\
\text{(c)} & \text{maximize } p(3/3) \text{ (which also minimizes } p(1/3) + p(2/3) \text{)}
\end{array}
\tag{3.68}
\]

By analogy to the 2 class hypersphere DFTR approach, hyperspheres
expand simultaneously from the $P_2$ and the $P_3$ observations to order
the $P_1$ observations. Also, hyperspheres expand simultaneously from
the $P_1$ and $P_3$ observations to order the $P_2$ observations. Let the
$P_1$, $P_2$, and $P_3$ observations be represented in Figure 3.22 by O's,
X's, and *'s, respectively. Regions $R_2^1$ and $R_3^1$ are formed about the
$P_2$ and $P_3$ observations, respectively, when a $P_1$ observation is
ordered. Regions $R_1^2$ and $R_3^2$ are formed about the $P_1$ and $P_3$
observations, respectively, when a $P_2$ observation is ordered. Since one
would like to maximize p(3/3) and satisfy (a) and (b) in equation 3.68,
the following decision rule is a logical choice.

i 

if  y  €  R3  n  R^ 
if  v  €  R*  n  Rj  n  r|  n  r23 

if  V  e  R^  HR*  n  rJ  (1  (3.  69) 

if  v  c  Rj  n  r^  fi  r*  n 
if  v  e  Rj  n  r|  -  r|  n  r^  n  Rj  n  r|  (e) 

Note that a rejection region is necessary in this case. The crosshatching
in Figure 3.22 illustrates the classification regions for this decision rule
with the nearest-neighbor rule being used in place of condition (e) above.

In the region with vertical crosshatching, the decision is made
in favor of class 3. In the region with northeast (NE) crosshatching, the
decision is made in favor of class 1. In the region with northwest (NW)
crosshatching, the decision is made in favor of class 2. The region
with no crosshatching is the rejection region. Therefore, the decision
rule is

\[
\begin{array}{ll}
d_3 & \text{if } V \in \text{(the region with vertical crosshatching)} \\
d_2 & \text{if } V \in \text{(the region with NW crosshatching)} \\
d_1 & \text{if } V \in \text{(the region with NE crosshatching)} \\
d_0 & \text{if } V \in \text{(the region with no crosshatching)}
\end{array}
\tag{3.70}
\]

Generalizing this approach to a problem with K classes, the
criterion is

\[
\begin{aligned}
&\Pr\Big\{ \sum_{i \ne 1} p(i/1) \le \beta_1 \Big\} \ge \gamma_1 \\
&\Pr\Big\{ \sum_{i \ne 2} p(i/2) \le \beta_2 \Big\} \ge \gamma_2 \\
&\qquad \vdots \\
&\Pr\Big\{ \sum_{i \ne K-1} p(i/K-1) \le \beta_{K-1} \Big\} \ge \gamma_{K-1} \\
&\text{maximize } p(K/K) .
\end{aligned}
\tag{3.71}
\]

It is seen that the $P_K$ observations are encircled by K-1 regions
$R_K^1, R_K^2, \ldots, R_K^{K-1}$ and the other observations are encircled by
K-2 regions. The resulting decision regions are complex and probably very
[illegible]. Hence this does not seem to be a desirable ordering procedure
for K classes.
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We  now  consider  an  ordering  procedure  which  is  better  suited  to 
the  hypersphere  DFTR  approach.  In  this  procedure,  a  hypersphere 
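expands from each observation of a particular class and orders the
observations of all other classes. This can be done for all K classes. Hence
the following probabilities of error are specified.

\[
\begin{aligned}
&\Pr\Big\{ \sum_{i \ne 1} p(1/i) \le \beta_1 \Big\} \ge \gamma_1 \\
&\Pr\Big\{ \sum_{i \ne 2} p(2/i) \le \beta_2 \Big\} \ge \gamma_2 \\
&\qquad \vdots \\
&\Pr\Big\{ \sum_{i \ne K} p(K/i) \le \beta_K \Big\} \ge \gamma_K
\end{aligned}
\tag{3.72}
\]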

Figure 3.23 illustrates this procedure for the two-dimensional case,
for K = 3 classes, and for the same samples as shown in Figure 3.22.
Region $R_1$ is obtained by hyperspheres which expand from the $P_1$
observations to order the $P_2$ and $P_3$ observations. The region in the
figure is completed when either a $P_2$ or a $P_3$ observation is found.
Likewise, region $R_2$ is completed when the hyperspheres expanding from
the $P_2$ observations intersect either a $P_1$ or a $P_3$ observation.
Region $R_3$ is completed when the hyperspheres expanding from the $P_3$
observations intersect either a $P_1$ or a $P_2$ observation. The decision
rule is given by equation 3.70.
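The construction of the regions for this procedure might be sketched as follows. The sketch makes two simplifying assumptions that the text does not confirm: all hyperspheres of a class share one radius, and the region is completed at the first observation of any other class (no further blocks are removed). The function name `build_regions` and the dictionary layout are hypothetical.

```python
import math


def build_regions(samples):
    """Build one hypersphere region per class.

    samples -- dict mapping a class label to a list of observations (tuples)
    Returns a dict mapping each label to (centers, radius), where the spheres
    about every observation of that class share the same radius.
    """
    regions = {}
    for label, points in samples.items():
        # Observations of all other classes are the ones being ordered.
        others = [q for other, pts in samples.items() if other != label for q in pts]
        # Assumption: expansion stops when any sphere first meets another class.
        radius = min(math.dist(p, q) for p in points for q in others)
        regions[label] = (list(points), radius)
    return regions
```

Because only pairwise distances are used, the same code applies in any number of dimensions, which matches the claim that the approach does not depend on the dimensionality of the space.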

A criterion which is well suited for the hypersphere DFTR approach
is

\[
\begin{aligned}
&\Pr\Big\{ \sum_{i \ne K} p(i/K) \le \beta_K \Big\} \ge \gamma_K \\
&\text{maximize } p(1/1) \\
&\text{maximize } p(2/2) \\
&\qquad \vdots \\
&\text{maximize } p(K-1/K-1)
\end{aligned}
\tag{3.73}
\]

Here hyperspheres expand simultaneously from the $P_1, P_2, \ldots, P_{K-1}$
observations to order the $P_K$ observations. An example for the 3 class
problem is shown in Figure 3.24 for the same samples as Figures 3.22
and 3.23. Note that a rejection region is unnecessary for this case.

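If one wishes to specify the individual errors, p(i/j), the following
criterion works well with the hypersphere DFTR procedure and is easily
extendable to K classes.

\[
\begin{array}{ll}
\Pr\{ p(3/2) \le \beta_1 \} \ge \gamma_1 , & \text{maximize } p(2/2) \\
\Pr\{ p(2/1) \le \beta_2 \} \ge \gamma_2 , & \text{maximize } p(1/1) \\
\Pr\{ p(1/3) \le \beta_3 \} \ge \gamma_3 , & \text{maximize } p(3/3)
\end{array}
\tag{3.74}
\]

This case is represented in Figure 3.25. The decision rule is

\[
\begin{array}{ll}
d_3 & \text{if } V \in R_3 \cap \bar{R}_1 \cap \bar{R}_2 \\
d_2 & \text{if } V \in R_2 \cap \bar{R}_1 \cap \bar{R}_3 \\
d_1 & \text{if } V \in R_1 \cap \bar{R}_3 \cap \bar{R}_2 \\
d_0 & \text{otherwise.}
\end{array}
\tag{3.75}
\]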

One may, of course, use the nearest-neighbor rule in $R_1 \cap R_2$,
$R_1 \cap R_3$, and $R_2 \cap R_3$ as shown in the figure.
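A K-class rule of this type, with the nearest-neighbor rule optionally applied where regions overlap, can be sketched as below. The data layout (each region stored as a list of sphere centers with a common radius) and the function name `decide_k` are assumptions made for illustration, and the label 0 is reserved here for "no decision".

```python
import math


def decide_k(v, regions, samples, use_nearest_neighbor=True):
    """Classify v given per-class hypersphere regions.

    regions -- dict: label -> (centers, radius)
    samples -- dict: label -> training observations, used only in overlaps
    Returns a class label, or 0 when no decision is made.
    """
    hits = [
        label
        for label, (centers, radius) in regions.items()
        if any(math.dist(v, c) <= radius for c in centers)
    ]
    if len(hits) == 1:
        return hits[0]  # v lies in exactly one region
    if len(hits) > 1 and use_nearest_neighbor:
        # Overlap: choose the candidate class with the closest training observation.
        return min(hits, key=lambda k: min(math.dist(v, p) for p in samples[k]))
    return 0  # outside every region, or overlap treated as rejection
```

With `use_nearest_neighbor=False` the overlaps are rejected, which corresponds to the "otherwise" branch of the decision rule; with the default setting the overlaps are resolved by nearest neighbor.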

In summarizing the utility of the hypersphere DFTR approach in
the K class problem, it may be stated that

(1) If one wishes to specify the conditional probabilities of error,
$\sum_{i \ne 1} p(i/1)$, $\sum_{i \ne 2} p(i/2)$, ..., $\sum_{i \ne K} p(i/K)$,
where K is large, and if one is willing to tolerate a rejection region,
another ordering procedure (for example that of Figure 3.20) might be more
appropriate than the hypersphere DFTR procedure. This, of course, depends
on the proximity of the classes, the modality of the class distributions,
and the dimensionality of the space.

(2)  For  other  situations  the  hypersphere-DFTR  approach  might  be 
more  appropriate  because 

(a)  The  rejection  region  is  unnecessary  for  certain  criteria. 

(b) The classification regions assume the shape of the observations;
hence the approach works well for multimodal class probability
distributions.

(c)  The  minimum  number  of  blocks  and  hence,  the  specification 
of  error  rates  is  not  dependent  on  the  dimensionality  of  the  space,  if  a 
bounded  region  is  desired. 

A comment should be made about item (c) above. One may use a
hypersphere ordering where the hypersphere contracts around each class. In
this case the approach is independent of the dimensionality of the space
and is eminently suited for the criterion given in equation 3.71.
The problem is to decide about what point or points the hypersphere(s)
should be centered. Of course, one could use the median of a certain variate
to center a contracting hypersphere. But many times this is not suitable,
for that median does not "center" the class. One may consider using
the sample mean of the observations. However, distribution-free tolerance
regions are not formed for all distributions when the sample mean is used.
Nevertheless, if the class of probability distributions governing the
populations is restricted, the sample mean can be used. For example,
McKay (1935) showed that in a normal population, the order statistics
measured from the sample mean are distributed independently of the
sample mean.
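The contracting-hypersphere ordering can be illustrated with a short sketch. It assumes a single sphere about a chosen center that shrinks from a very large radius, so that each observation it passes closes one block; after r blocks have been removed the radius equals the r-th largest distance from the center. The function names are hypothetical, and using the sample mean as the center carries the caveat stated above: the resulting region is not distribution-free for arbitrary distributions.

```python
import math


def sample_mean(points):
    """Componentwise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def contracting_radius(points, center, blocks_removed):
    """Radius of the sphere after it has contracted past `blocks_removed` observations."""
    distances = sorted((math.dist(p, center) for p in points), reverse=True)
    if not 1 <= blocks_removed <= len(distances):
        raise ValueError("blocks_removed must lie between 1 and the number of points")
    # The first block removed is everything beyond the farthest observation.
    return distances[blocks_removed - 1]
```

A region for one class would then be the set of points within `contracting_radius(points, center, r)` of the center, and it contains n+1-r of the n+1 blocks whatever the dimension of the space.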

Summary 

A  classification  procedure  has  been  presented  which  seems 
reasonable  when  nothing  is  known  about  the  class  probability  distributions 
and  when  it  is  desirable  to  specify  some  of  the  conditional  probabilities 
of  error.  It  is  assumed  that  a  properly  classified  sample  of  independent 
observations  is  available  from  each  class.  This  approach  is  advocated 
because: 

(1)  Appropriate  decision  regions  are  formed  for  multimodal 
probability  distributions. 

(2)  The  approach  is  independent  of  the  dimensionality  of  the 
sample  space. 

(3)  The  approach  is  very  simple  to  program  on  a  digital  computer. 
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(4)  Automatic  data  reduction  results  when  this  approach  is  used. 

(5)  No  rejection  region  is  required  for  certain  error  criteria. 

(6) The approach indicates when the classification system should
be redesigned because of expected probabilities of error which are too
large.
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Chapter  4 


AUTOMATIC  SPEAKER  VERIFICATION 

4.1. Introduction

The  purpose  of  this  section  is  to  report  on  an  automatic  speaker 
verification  system  and  its  use  in  testing  the  hypersphere  DFTR  class¬ 
ification  schemes.  A  speaker  verification  system  is  one  which  tests 
the  purported  identity  of  a  speaker  from  a  sample  of  the  speaker's  voice. 
An  automatic  speaker  verification  system  accomplishes  this  without 
any  human  intervention  in  the  decision  process.  For  example,  suppose 
an  automatic  speaker  verification  system  is  to  be  used  for  allowing  a 
person  entrance  through  a  company  gate.  The  test  subject  might  be 
required  to  push  a  button  beside  the  name  of  the  person  he  purports  to 
be. The test subject then says a required phrase, for example, "My
name is speaker X." A computer then identifies him as the main
speaker  (the  speaker  whose  identity  the  test  subject  has  assumed)  or  an 
impostor. 

This  paper  uses  the  terms  speaker  verification  and  speaker 
recognition  in  accordance  with  their  use  in  the  literature.  In  a  speaker 
verification  system  the  decision  is  made  whether  the  test  speaker  is 
Speaker  1  or  not  Speaker  1.  In  a  speaker  recognition  system  the  decision 
is  made  whether  the  test  speaker  is  Speaker  1,  or  Speaker  2,  .  .  .  ,  or 
Speaker  K. 
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It  is  assumed  in  this  experiment  that  the  test  speaker  desires 
recognition as the main speaker. However, no mimicry was involved
in these tests.

4.2.  Speaker  Recognition  Review 

The  relationship  between  speaker  recognition  —  the  recognition 
of  the  identity  of  a  speaker  from  his  speech  —  and  speech  recognition  — 
the  recognition  of  the  content  of  the  speech  no  matter  who  is  talking  — 
is  quite  interesting.  In  the  first  case  it  is  the  similarity  in  the  voices 
that  makes  the  recognition  process  difficult  while  in  the  latter  case  it 
is  the  difference  in  the  voices  that  makes  the  recognition  process  difficult. 
Hopefully,  what  is  learned  about  speaker  recognition  can  be  employed 
to  improve  speech  recognition.  The  ideal  speech  processor  is  one  which 
can  extract  the  differences  in  the  voices  for  speaker  identification  and 
use  the  similarities  in  the  voices  for  speech  recognition. 

The literature contains many papers on speaker recognition.
In some of the papers the recognition is done by humans, cf. Pollack
et al. (1954); in some of the papers by a combination of humans and
machine, cf. Kersta (1962a), (1962b); and in some of the papers by machine
alone, cf. Luck (1969).

Pollack, Pickett, and Sumby (1954) tested the ability of humans
to identify speakers. Emphasis was placed on recognition accuracy
versus duration of speech. For 8 different male speakers, the listeners
were able to correctly identify the speakers 70% of the time when a
monosyllabic word was spoken. Recognition rates increased to 80% for
a speech sample of 0.65 seconds duration and to 90% for a speech sample
of 1 second duration.

Compton (1963) studied the ability of 15 listeners to recognize 9
speakers from recordings of vowels as the vowel duration was varied.
The recognition rates for the vowel with IPA (International Phonetic
Association) symbol /i/ ranged from 36% for a speech duration of 25
milliseconds to 57% for a speech duration of 1500 milliseconds. He
found that a narrower bandwidth required a greater duration of speech for
the same recognition rate. He also found that attenuation of the
frequencies below 1020 Hz did not affect the ability of the listeners to
recognize the speakers.

Bricker  and  Pruzansky  (1966)  conducted  a  speaker  recognition 
experiment  with  10  speakers  and  16  listeners,  all  of  whom  had  worked 
together  for  at  least  two  years.  They  used  excerpted  vowels,  consonant- 
vowel  sequences,  monosyllabic  words,  disyllabic  nonsense  words  and 
sentences.  One  of  their  results  was  that  the  recognition  rate  improved 
directly  with  the  number  of  phonemes  in  a  speech  sample,  even  when 
the  duration  of  the  speech  sample  was  controlled.  The  recognition 
accuracy  ranged  from  56%  for  vowels  of  117  milliseconds  duration  to 
98%  for  sentences  of  2.4  seconds  duration.  They  also  reported  on  a 
computer  recognition  system  which  used  60  measurements  of  each  vowel. 

Here they obtained a recognition rate of 79%. This is an improvement of
23 percentage points over the result obtained by the listeners for this
short duration speech segment.


L. G. Kersta (1962a), (1962b), (1965), (1966) has employed many
techniques for extracting speaker identity from spectrograms or
"voice prints" as he calls them. Spectrograms are two-dimensional
pictures of the speech showing the speech magnitude versus frequency
and time. This is done in two dimensions by allowing the blackness of
the spectrogram to be proportional to the magnitude of the speech.
Kersta achieves remarkable recognition rates. Some examples are:
97% or better in (1962a), 99% in (1962b), 96% in (1965), better than 90%
for 120 speakers in (1966). The references (1962a) and (1962b) report
on the training of people to identify the speakers from "voice prints."
The references (1965) and (1966) deal with computer recognition of the
speakers from "voice prints."

Kersta's work has been criticized by various authors. Ladefoged
and Vanderslice (1968), in a 17-page paper, criticize Kersta's technique
as being more of an art than a science. They list evidence that at times
the spectrograms that Kersta uses for identification are readily
confused with spectrograms of different people that Ladefoged and
Vanderslice have obtained. Young and Campbell (1967) trained observers to
identify speakers from monosyllables by Kersta's method of visually
comparing spectrograms. The training and test words were spoken in
different contexts. Rather poor recognition rates (78% for words in
the same context, 37.3% for words in different contexts) were reported.
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They  listed  the  techniques  which  were  employed  and  possible  reasons 
why  their  recognition  rates  were  much  poorer  than  Kersta's. 

Pruzansky (1963) reported on an automatic speaker recognition
system which used energy-time-frequency patterns. Seven band-pass
filters were used to obtain the frequency components of the measurement
space. The time components were obtained by sampling the output of
each filter at 10 msec intervals. Seven male and three female speakers
repeated the required speech four times. Ten words were excerpted
for analysis. Three utterances of each word by each speaker were used
to form a reference vector for each word and each speaker. These
three utterances plus the 4th utterance were used to test the system.
A test observation was classified into the class whose reference vector
gave maximum correlation with the test vector. A recognition rate of
89% was obtained. Pruzansky and Matthews (1964) investigated a method
for reducing the number of features used to recognize the speakers.
Energy-time-frequency patterns were again used. This time 7 utterances
were taken. The first 3 utterances were used to form the reference
patterns and the last 4 utterances were used as test observations. A test
observation was classified into the class whose reference vector was
closest to the test vector. A recognition rate of approximately 90% was
obtained. Recognition rates of 90% were also achieved by Hargreaves
and Starkweather (1963).
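The maximum-correlation classification used in these studies can be sketched as follows. This is an illustration only, not the code of the cited systems: it assumes the reference vector is the componentwise mean of the training utterances and that "correlation" means the Pearson correlation coefficient, neither of which is specified in the summaries above. All function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Componentwise mean of equal-length feature vectors (assumed reference vector)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return numerator / denominator if denominator else 0.0


def classify(test_vector, references):
    """Return the speaker whose reference vector correlates best with the test vector."""
    return max(references, key=lambda speaker: correlation(test_vector, references[speaker]))
```

A table of references could be built with `{speaker: mean_vector(training_utterances[speaker])}` and each test utterance passed to `classify`.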

Li, Dammann, and Chapman (1966) reported on an automatic
speaker verification system. Using one main speaker and 10 impostors
they obtained recognition rates from 80% to over 90% depending on the
phrase which was used. For the phrase "My name is ___" they
obtained a recognition rate of approximately 85%.

Glenn and Kleiner (1968) used power spectra which were produced
during nasal phonations to recognize speakers. A total of 20 male and
10 female speakers were used. Ten occurrences were used to form a
reference vector. Ten occurrences of /n/ from a different list were
used to test the system. A test observation was classified into the
class whose reference vector gave maximum correlation with the test
vector. The recognition rate increased with the number of observations
which were used to find the reference vector: 43% for one observation,
68% for two observations, 82% for 5 observations, and 93% for 10
observations.

Das (1969) reported on a speaker verification system which
employed 6 main speakers and 13 impostors. Fifty training observations
were taken from each main speaker. Ninety observations were used to
test each main speaker and 30 observations were used to test each
impostor. Approximately 1600 measurements were obtained from each
speech segment. The number of measurements (dimension of the
measurement space) was reduced to 200 through the use of analysis of
variance. The recognition rates ranged from 91.4% to 98.6% with the
average recognition rate being 95.4%.

Speaker  recognition  studies  have  not  been  limited  to  the  speakers 
of  English.  Solzhenitsyn  in  The  First  Circle  mentions  work  in  this 
area  during  the  Stalin  Era.  In  more  recent  times  Ramishvilli  (1966) 
reports  on  an  automatic  speaker  recognition  system  which  achieves  a 
recognition  rate  greater  than  90%. 

It should be noted that in most of these experiments some
phase of the process was done by a human. For example, in many
cases the words or phonemes were excerpted manually. The speaker
verification experiment which follows has the advantage that it can be
completely automated.

4.3. Experimental Setup

The  experimental  work  for  this  thesis  was  done  on  equipment 
at  the  Applied  Research  Laboratory  (ARL),  Sylvania  Electronic  Systems, 
Waltham,  Massachusetts.  The  equipment  for  the  preliminary  study  on 
phonemes  for  speaker  recognition  and  the  data  for  the  speaker  verifica¬ 
tion  project  were  graciously  furnished  by  Dr.  James  E.  Luck  of  that 
laboratory.  The  equipment  consists  of  a  computer  program  (see  Luck 
(1968a),  (1968b)),  a  Control  Data  CDC  3200  computer,  a  Texas  Instruments 
846  11  bit  A/D  converter,  a  remote  control  unit,  and  other  associated 
equipment. 
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Two  speaker  recognition  experiments,  a  phoneme  extraction 
experiment and a speaker verification experiment, were done. The
speech data for the phoneme experiment was taken in a soundproof room
at the Audio-Visual Center at Yale University on Ampex professional
equipment. It was then played back at ARL on an Ampex PR-10 tape
recorder, and the EDIT computer program, Luck (1968a), was used to
record the data on digital tape.

The speaker verification data was taken at ARL from a microphone
in a soundproof room. The speech was band limited, A/D
converted, and stored on digital tape. The band-pass filter was flat to
3 kHz and down 25 dB at 4 kHz. The A/D conversion rate was 8000
samples per second at 10 bit accuracy. The data was recorded on
digital tape in records of 2000 24-bit words. Each computer word contained
two 12 bit samples. The digital tapes were converted at the Yale
University Computer Center for use on the IBM 7040-7094 DCS system.
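The packing of two 12 bit samples into each 24 bit word can be illustrated with a small sketch. The ordering of the two samples within a word (high half first) and their treatment as unsigned values are assumptions made here for illustration; the report does not state either, and the function names are hypothetical.

```python
def unpack_word(word):
    """Split one 24-bit word into two 12-bit samples, high half first (assumed order)."""
    if not 0 <= word < (1 << 24):
        raise ValueError("word must fit in 24 bits")
    return (word >> 12) & 0xFFF, word & 0xFFF


def unpack_record(words):
    """Unpack a record of 24-bit words into a flat list of 12-bit samples."""
    samples = []
    for word in words:
        samples.extend(unpack_word(word))
    return samples
```

Under this layout a record of 2000 words yields 4000 samples, or half a second of speech at 8000 samples per second.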

To be able to use the DFTR classification procedures, the
utterances must be independent. This was ensured by Dr. Luck's data
gathering procedure. A command from the teletype instructs the
computer to accept data. At the same time an indicator light tells the
speaker to repeat the test sentence. Then the speaker says the sentence
and the computer processes the data. There is a one-minute delay
before the indicator light requests another utterance.

Dr.  Luck's  system  for  recording  the  data  is  such  that  the  samples 
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from  the  A/D  converter  are  temporarily  stored  from  the  moment  that 
the  indicator  light  is  turned  on.  When  the  amplitude  of  a  sample  exceeds 
a  certain  threshold,  the  data  which  follows  that  sample  along  with  the 
data  just  prior  to  that  sample  are  stored.  A  total  of  8000  samples  are 
stored  per  utterance. 
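The threshold-triggered capture described here can be sketched as below. The number of samples retained from before the trigger (`pre_trigger=200`) is a hypothetical value, since the text says only that "the data just prior to that sample" is kept; the function name is likewise illustrative.

```python
def capture_utterance(samples, threshold, pre_trigger=200, total=8000):
    """Return up to `total` samples around the first sample exceeding the threshold.

    samples     -- buffered A/D samples since the indicator light came on
    threshold   -- amplitude that marks the start of speech
    pre_trigger -- samples kept from before the trigger (hypothetical value)
    """
    for index, value in enumerate(samples):
        if abs(value) > threshold:
            start = max(0, index - pre_trigger)
            return samples[start:start + total]
    return []  # the threshold was never exceeded
```

Keeping a short stretch before the trigger avoids clipping a weak onset that falls below the threshold, and at 8000 samples per second the stored 8000 samples cover one second of speech.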

4.4. Preliminary Study: Speaker Recognition by Analysis of Phonemes

As  a  first  step  into  speaker  recognition  we  decided  to  study  the 
use  of  phonemes  for  the  recognition  of  speakers.  The  phonemes  to  be 
employed  were  those  that  required  different  lip,  jaw,  and  tongue  positions 
and  various  sections  of  the  nasal  and  oral  cavities.  It  was  hoped  that 
the  different  physical  constraints  placed  upon  the  speech  by  the  different 
speakers  would  yield  the  speaker's  identity. 

Fifteen adult males were recorded while repeating the words of
Table 4.1. They were requested to speak normally and to pause between
words. It should be noted that the words were not excerpted from
connected speech.

The words which contained the vowels were chosen because they
exhibit the various tongue positions. Consider the words "beat, bit,
bait, bet, bat." The tongue is positioned toward the front of the mouth
for the vowels in these words. For the "ea" of "beat" the vertical
position of the tongue is high in the mouth. The vertical position of the
tongue is lower for the vowel in "bit," lower still for the vowel in "bait,"
and lowest for the vowel in "bat." The horizontal position of the tongue
during the vowels in "do, foot, dough, bought, dot" is in the back of the
mouth. The vertical position of the tongue ranges from high for the
vowel in "do" to low for the vowel in "dot."

Table 4.1. Text for Phoneme Analysis

Vowels    Fricatives   Stops   Sonorants   Diphthongs
beat      Huff         pop     mum         my
bit       verve        bob     none        how
bait      sauce        tat     lung        toy
bet       zoos         dad     lull        amuse
bat       mesh         kick
but       measure
do        thin
foot      then
dough
bought
dot

Words using the consonants were included to see if the information
they exhibit for speaker recognition is sufficient to justify the
higher bandwidth necessary to accommodate them. Diphthongs were
included so that the information about the identity of the speakers in the
transition between phonemes could be investigated.

The  first  experiment  was  a  spectral  comparison  of  the  11  vowels 
listed  in  Table  4.1  for  the  different  speakers.  Each  of  the  15  speakers 
said each of the 11 words containing the vowels. Each word was said
once and contributes one utterance. Thus there were 165 utterances for
analysis. 

The speech signals were filtered to pass frequencies up to 4 kHz
and digitized at 10,000 samples per second. The vowels were isolated
by means of the digital computer program EDIT, Luck (1968a). This
program enables one to examine digital signals in great detail. Segments
of  the  speech  can  be  heard  by  the  researcher  and  simultaneously  observed 
on  an  oscilloscope.  The  speech  window  can  be  lengthened  or  shortened 
as  desired  and  the  speech  which  is  observed  in  the  window  can  then  be 
transferred  to  digital  tape. 

The Fast Fourier Transform, Cooley and Tukey (1965), Cochran
et al. (1967), was used to obtain the spectral components of the speech.
Four short-term spectra of each vowel were calculated and plotted by
the computer. The first spectrum was calculated from the first 256
samples of the isolated vowel. The second spectrum was calculated
from the next 256 samples, the third spectrum from the next 256 samples,
and the fourth spectrum from the next 256 samples. The total energy

in each spectrum was made the same. From inputs $x_t$ and $y_t$,
$1 \le t \le 256$, the FFT program computes $a_f$ and $b_f$ where

\[
a_f + j b_f = \sum_{t=1}^{256} (x_t + j y_t)\, e^{-j 2\pi t f / 256},
\qquad f = 1, \ldots, 256. \tag{4.1}
\]

Note that $x_t$ is the amplitude of the speech signal at time $t$ and $y_t$,
the imaginary part, is zero for all $t$. Since the imaginary part is
zero, there is symmetry in $a_f$ and $b_f$ about the midfrequency. Hence,
only 128 unique $a_f$ and 128 unique $b_f$ result. Let $c_f$ be defined by

\[
c_f = \sqrt{a_f^2 + b_f^2}, \qquad f = 1, \ldots, 128. \tag{4.2}
\]
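
As a rough illustration of equations 4.1 and 4.2, the sketch below evaluates the transform directly rather than with a fast algorithm. The function name is hypothetical, and the direct sum is a stand-in for the FFT program used in the experiments; it returns the same magnitudes $c_f$ for a real input.

```python
import cmath
import math


def magnitude_spectrum(x):
    """Return c_f = |a_f + j b_f| for f = 1, ..., N/2, where N = len(x).

    The input is real (y_t = 0), so only the first N/2 magnitudes are unique.
    Time is indexed t = 1, ..., N as in equation 4.1.
    """
    n = len(x)
    c = []
    for f in range(1, n // 2 + 1):
        acc = sum(x[t - 1] * cmath.exp(-2j * math.pi * t * f / n)
                  for t in range(1, n + 1))
        c.append(abs(acc))
    return c
```

For a 256-sample interval this yields the 128 values $c_1, \ldots, c_{128}$.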

The  qualitative  result  from  the  experiment  was  that  the  difference 
in  spectra  for  two  different  speakers  saying  the  same  vowel  was  not 
much  greater  than  the  difference  in  spectra  for  the  same  speaker  taken 
at  different  times  during  the  utterance.  This  was  concluded  from 
visually  observing  the  spectra  and  by  deterministically  comparing  the 
spectra.  The  spectra  were  deterministically  compared  as  follows. 

Let the $c_f$, $f = 1, \ldots, 128$, be the coordinates of a measurement space.
Then each spectrum can be represented as a vector in this 128-dimensional
measurement space. The distance was calculated between the
various  spectra.  In  all  cases  the  closest  spectrum  to  any  given  spectrum 
was  one  from  the  same  utterance.  That  is,  the  closest  spectrum  to  the 
third  spectrum  of  one  speaker  was  either  the  first,  second,  or  fourth 
spectrum  of  the  speaker.  However,  in  many  cases  the  distance  between 
the  first  and  the  fourth  spectrum  was  greater  than  the  distance  between 
spectra  of  different  persons. 
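
A minimal sketch of this comparison is given below. It assumes the distance is Euclidean, which this passage does not state explicitly; the function name and the `(speaker, index)` keys are hypothetical.

```python
import math


def nearest_spectrum(spectra, key):
    """Return the key of the spectrum closest to spectra[key].

    `spectra` maps (speaker, spectrum_index) to a list of c_f values.
    Euclidean distance is an assumption made for this sketch.
    """
    target = spectra[key]
    others = (k for k in spectra if k != key)
    return min(others, key=lambda k: math.dist(spectra[k], target))
```

Applying `nearest_spectrum` to every spectrum and checking whether the result shares the same speaker reproduces the kind of tally described above.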

This experiment was conducted because in most automatic
speaker recognition systems the same point in each phoneme is not
located* each time that the phoneme is uttered. The purpose of this
test  was  to  determine  how  this  variation  in  locating  the  phoneme  affected 
speaker  recognition.  Since  the  experiment  was  not  very  encouraging 
and  since  the  EDIT  program  required  considerable  computer  time  for 
the  isolation  of  the  phonemes,  we  decided  to  use  the  information  in  the 
transitions  between  phonemes  in  addition  to  the  short-term  spectra  of 
the  phonemes  for  the  speaker  verification  experiment. 


*  Note  that  the  phonemes  here  were  not  extracted  automatically  but 
rather  they  were  extracted  manually  through  the  EDIT  computer  program. 
In  the  usual  automatic  speaker  recognition  system  the  phoneme  is  iso¬ 
lated  by  an  approximate  method.  For  example,  many  times  a  vowel  is 
located  by  finding  the  point  of  largest  amplitude  in  speech  which  has  been 
low-pass  filtered.  Hence  in  many  trials  one  is  likely  to  isolate  many 
different segments of the phoneme.

4.5. Measurements for the Speaker Verification Experiment

It  is  hypothesized  that  different  speakers  differ  in  some  of  the 
following  aspects:  the  size  and  shape  of  the  nasal  and  oral  cavities, 
the  placement  of  the  teeth,  the  tongue  mass,  and  the  manner  in  which 
the tongue, lips, and jaws are used in speaking. It is assumed that
much  of  the  information  for  the  recognition  of  speakers  is  contained  in 
the  transition  between  phonemes  as  well  as  in  the  phonemes  themselves. 
For  this  reason,  a  simple  word  which  contained  a  diphthong  was  analyzed 
by  calculating  many  short-term  spectra  over  the  length  of  the  word. 
These spectra were used to form the measurement space in which the
decision  regions  were  constructed. 

Recordings  of  225  utterances  by  each  of  three  speakers  and  25 
utterances  by  each  of  26  impostors  were  furnished  by  Dr.  James  Luck. 
These  utterances  consisted  of  the  sentence  "My  code  is  (and  then  the 
speaker's  initials)"  digitized  into  10  bit  accuracy  at  8000  samples  per 
second.  Only  the  word  "my"  was  used  in  this  analysis. 

The  word  "my"  was  considered  a  good  word  for  the  speaker 
recognition  experiment  for  the  following  reasons.  First,  the  nasal  /m/ 
is  thought  to  give  good  measurements  for  speaker  identification  because 
of the relatively fixed influence of the nasal cavity. See Glenn and
Kleiner (1968) or Wolf (1969). Also, the word "my" contains a diphthong
(/aI/) and is therefore good for testing the usefulness for speaker identification
purposes of the information in the transitions between phonemes.
Furthermore, the word "my" is a good word for our purposes because
very little information is lost from the word when it is filtered at 3.4
kHz. Notice that the stop consonant "c" in "code" allows the word "my"
to be isolated by simple amplitude detection. Therefore, the recognition
system is easily automated.

To  extract  the  speaker  identity  information  in  the  transition 
between  phonemes,  many  short-term  spectra  were  calculated  over  the 
duration  of  "my."  The  questions  to  be  answered  were:  (1)  How  many 
spectra  should  be  used?  (2)  What  should  be  the  frequency  resolution 
of  each  spectrum? 

These questions were answered by utilizing three different 256-dimensional
measurement spaces. The first measurement space was
made  up  of  four  spectra  with  each  spectrum  having  64  frequency  com¬ 
ponents.  The  second  space  consisted  of  eight  spectra  with  each  spec¬ 
trum  having  32  frequency  components.  The  third  space  consisted  of 
16  spectra  with  each  spectrum  having  16  frequency  components.  Later 
a  reduced  space  consisting  of  six  spectra  with  each  spectrum  having 
eight  frequency  components  was  used. 

Now consider in detail the procedure for obtaining the measurements
which were used in the experiments. A chart outlining the
procedure is shown in Figure 4.1. The data from Dr. Luck was stored
on digital tape in records of 2000 24-bit words. Each word contained
two 12-bit samples of the speech. The data was blocked at Yale University
into 460 words per record so that it could be read on the 7040-7094
DCS system. The samples were then unpacked and placed in a 36-bit
word.


Figure 4.1. Chart for Obtaining Measurements.

The  beginning  and  ending  of  the  word  "my"  were  located  by 
amplitude  detection.  The  thresholds  for  the  detection  were  obtained 
from  a  preliminary  study  on  the  first  five  utterances  by  the  first  three 
speakers of the impostor data. See Table 4.2. It should be noted that
some difficulty was incurred in finding the beginning of "my" for the
second speaker. This was due to a low amplitude guttural sound produced
by this speaker. Even by inspection it was rather difficult to decide
where the /m/ begins.

To ascertain that the word "my" was detected, a minimum
acceptable word length and an acceptable word position in the record
were established. If the length of the word was too short, a search was
made for a longer word. The occurrence of a short word was sometimes
due to an extraneous noise made during the recording. The computer
was programmed so that if a word of acceptable length was found but the
word was cut off by the end of the record, the computer rejected this
word. It then proceeded to read another utterance from the input tape.
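
The detection logic described above might be sketched as follows. The details are assumptions made for illustration: the text gives neither the threshold values nor the rule for deciding that a word has ended, so the `quiet_run` parameter (number of consecutive quiet samples that closes a word) and the function name are hypothetical.

```python
def find_word(samples, start_thresh, end_thresh, min_len, quiet_run=200):
    """Return (begin, end) sample indices of the first acceptable word, or None.

    A word begins at the first sample whose magnitude reaches `start_thresh`
    and ends after the last sample reaching `end_thresh` that is followed by
    more than `quiet_run` quiet samples (an assumed rule). Words shorter than
    `min_len` are treated as extraneous noise and skipped. None is returned
    if no word is found or the candidate is cut off by the end of the record.
    """
    n = len(samples)
    i = 0
    while i < n:
        if abs(samples[i]) < start_thresh:
            i += 1
            continue
        begin = last_loud = i
        j = i + 1
        while j < n and j - last_loud <= quiet_run:
            if abs(samples[j]) >= end_thresh:
                last_loud = j
            j += 1
        if j >= n and j - last_loud <= quiet_run:
            return None  # record ended before the word did: reject it
        end = last_loud + 1
        if end - begin >= min_len:
            return begin, end
        i = end  # too short, keep searching for a longer word
    return None
```

A caller that receives `None` would move on to the next utterance on the input tape.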

Location  of  the  Intervals 

Figure 4.2 shows four typical amplitude versus time waveforms
of the filtered speech "My code." The top waveform includes the
beginning of the /d/ in "code." The bottom waveform contains none of
the /d/. Notice that the /c/ has been filtered and is barely visible in the
figure.


Table 4.2. Impostor Data

Speaker                                  Speaker
Number   Name            Date Uttered    Number   Name            Date Uttered

   1     R. Freudberg    7-30-68           15     G. Briskman     7-3-68
   2     J. Luck         8-6-68            16     G. Bethoney     7-3-68
   3     J. DeLellis     7-3-68            17     C. Mariano      7-3-68
   4     H. Shaffer      7-3-68            18     R. Pike         7-3-68
   5     C. Howard       7-3-68            19     J. Stoddard     7-3-68
   6     H. Manley       7-10-68           20     D. Kinsley      7-3-68
   7     K. Lang         7-3-68            21     R. Hasselboum   7-3-68
   8     L. Abraham      7-3-68            22     T. MacDonald    7-3-68
   9     G. Cummings     7-3-68            23     J. Waggett      7-3-68
  10     F. Cassidy      7-3-68            24     S. Free         7-3-68
  11     R. Lucy         7-3-68            25     G. Beakley      10-16-68
  12     A. Levesque     7-3-68            26     W. Wright       10-16-68
  13     J. Boucher      7-3-68            27     B. Fitzgerald   10-16-68
  14     H. Halewisn     7-3-68


Table 4.3. Main Speaker (RF) Data

Sitting Number   Date       Time

      1          7-3-68     11:30 a.m.
      2          7-3-68     12:15 a.m.
      3          7-3-68     1:35 p.m.
      4          7-10-68    3:15 p.m.
      5          7-11-68    9:15 a.m.
      6          7-16-68    10:00 a.m.
      7          7-18-68    ?
      8          7-25-68    11:00 a.m.


Figure 4.2. Four Utterances of "My code...".

Above  each  waveform  is  a  set  of  numbered  intervals.  These 
intervals  show  where  the  spectra  are  calculated  for  the  four  different 
measurement  spaces  which  were  used  in  the  experiments.  The  intervals 
above  the  top  three  waveforms  show  the  regions  analyzed  for  the  three 
different  256  dimensional  measurement  spaces.  The  information  in  the 
regions  outside  these  intervals  is  not  used.  Notice  that  there  is  con¬ 
siderable  overlap  of  the  intervals  when  16  spectra  are  calculated. 

Each  interval  is  of  a  fixed  duration,  256  time  samples  or 
approximately  32  milliseconds.  The  intervals  are  placed  uniformly 
across  "my."  The  first  interval  is  placed  at  the  "beginning"  of  the  word 
and  the  last  interval  is  placed  at  the  "ending"  of  the  word. 
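
One way to read "placed uniformly" is that the interval start points are equally spaced, with the first interval starting at the beginning of the word and the last interval ending at its end. The sketch below follows that reading; it is an assumption, as are the function name and the example word length.

```python
def interval_starts(begin, end, n_intervals, width=256):
    """Start indices of `n_intervals` windows of `width` samples over [begin, end).

    The first window starts at `begin` and the last one ends at `end`;
    the remaining windows are equally spaced in between (assumed reading).
    """
    if end - begin < width:
        raise ValueError("word is shorter than one interval")
    if n_intervals == 1:
        return [begin]
    step = (end - width - begin) / (n_intervals - 1)
    return [begin + round(k * step) for k in range(n_intervals)]


# A hypothetical word of 2048 samples (256 ms at 8000 samples per second):
# 8 intervals exactly tile it, 16 intervals overlap, 4 intervals leave gaps.
starts_16 = interval_starts(0, 2048, 16)
```

With this placement the spacing between starts falls below 256 samples once enough intervals are used, which is the overlap noted above for the 16-spectra space.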

A spectrum is calculated from the 256 time samples in each
interval by means of the Fast Fourier Transform (FFT). This means
that equations 4.1 and 4.2 are used to obtain $c_1, \ldots, c_{128}$ for each
interval. Now let

\[
d_{f'} = \frac{1}{M} \sum_{f = M(f'-1)+1}^{M f'} c_f,
\qquad f' = 1, \ldots, 128/M, \tag{4.3}
\]

where $M$ is a positive integer. The $d_{f'}$ are used to form the coordinates
of the measurement space. For example, let us consider the first measurement
space where four spectra are used, with each spectrum having
64 frequency components. Here the integer $M$ of equation 4.3 is equal
to 2. This gives 64 frequency components, $d_1, \ldots, d_{64}$, for each
spectrum. Let us denote the measurements from all 4 spectra by
$d_1, \ldots, d_{256}$. Each one of these measurements is used as a coordinate
of the first measurement space. Then each utterance of the word "my"
is  represented  by  a  vector  in  this  space.  In  these  experiments  all  vectors 
are  normalized  to  have  the  same  length.  This  is  done  to  eliminate  the 
variation  in  amplitude  of  the  speech. 
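
The averaging of adjacent components and the length normalization can be sketched as below. Two points are assumptions: that equation 4.3 is a plain average over $M$ adjacent components, and that "the same length" means unit Euclidean norm. The function names are hypothetical.

```python
import math


def band_average(c, m):
    """Average each group of m adjacent spectral components (equation 4.3)."""
    if len(c) % m:
        raise ValueError("number of components must be a multiple of m")
    return [sum(c[i:i + m]) / m for i in range(0, len(c), m)]


def measurement_vector(spectra, m):
    """Concatenate the band-averaged spectra and scale to unit Euclidean length."""
    v = [d for c in spectra for d in band_average(c, m)]
    norm = math.sqrt(sum(d * d for d in v))
    return [d / norm for d in v] if norm > 0 else v
```

With four 128-component spectra and `m=2` the result has 256 coordinates, matching the first measurement space; eight spectra with `m=4` or sixteen with `m=8` give the other two.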

For the second and third measurement spaces, the integer $M$ in
equation 4.3 was set equal to 4 and 8, respectively. Since 8 spectra
are calculated for the second measurement space and 16 for the third
space, all three spaces have 256 dimensions. Note that the first space
considers spectral detail more important than spectral variation in time.
The third space does the reverse. That is, it considers spectral variation
in time more important than spectral detail. One object of the experiment
is to determine the relative importance of these factors for the recognition
of speakers.


4.6. Data

Table 4.2 shows the 27 speakers who were recorded on the
impostor tape and the date of the recordings. Each speaker recorded 25
utterances of "My code is ___" at one sitting. The first 8 utterances
were used as training observations. The last 17 utterances were used as
test observations. The number 8 was chosen primarily with the results
of Glenn and Kleiner (1968) in mind. It was desirable to use a sufficient
number of training observations to get a good representation of the
speaker but nevertheless to have enough unused observations to extensively
test the verification system.

The computer program was unable to locate the word "my" in
3 out of the 675 utterances on the impostor tape. In the training phase
this happened for the 2nd utterance of the 24th speaker. Hence, the 9th
utterance of the 24th speaker was used in its place as a training observation.
A total of 8*26 = 208 impostor observations were used in the
training phase. The computer program was unable to locate "my" for
the 21st utterance by the 4th speaker and the 21st utterance by the 9th
speaker. Together with the 9th utterance of the 24th speaker, which had
been moved to the training set, this removed three observations from the
test set. Hence a total of 17*26 - 3 = 439 impostor observations were
used in the test phase.

Table 4.3 shows the main speaker data for R. Freudberg (RF).
Twenty-five utterances by RF were recorded at each sitting. The first
8 utterances at the first 5 sittings were used as main speaker training
observations. The computer program was able to locate the word "my"
for all RF utterances. Therefore 5*8 = 40 main speaker training observations
and 185 main speaker test observations were used. Note there are
25 main speaker test observations listed with the impostor data.
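
The bookkeeping behind these counts can be checked with a few lines of arithmetic. The variable names are illustrative; the figures are those given in the text and in Tables 4.2 and 4.3.

```python
utterances_per_sitting = 25
train_per_sitting = 8

# Impostors: 26 speakers, one sitting each on the impostor tape.
impostors = 26
not_located_in_test = 2   # 21st utterance of speaker 4 and of speaker 9
moved_to_training = 1     # 9th utterance of speaker 24 replaced the unlocated 2nd
impostor_train = impostors * train_per_sitting
impostor_test = (impostors * (utterances_per_sitting - train_per_sitting)
                 - not_located_in_test - moved_to_training)

# Main speaker RF: 8 sittings in Table 4.3 plus the sitting on the impostor tape.
rf_sittings = 8 + 1
rf_train = 5 * train_per_sitting
rf_test = rf_sittings * utterances_per_sitting - rf_train

print(impostor_train, impostor_test, rf_train, rf_test)
```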

4.7.  Decision  Regions 

The decision procedures which are used in the speaker verification
experiment are summarized in this section. They were completely
described in Chapter 3. Let class 1 be the class of the impostors and
class 2 be the class of the main speaker. If a test observation falls in
region R, it is classified as the main speaker. Otherwise it is classified
as an impostor.

(1) AHE (All Hyperspheres Expand): All hyperspheres expand until
$n_1 + 1$ blocks have been found. The first $m$ of these blocks are used to
form region R.

(2) OHC (Ordered Hyperspheres Constant): All hyperspheres expand
until the first block has been formed (a class 1 observation has been
ordered). All hyperspheres except the one which ordered the class 1
observation expand until another block is formed. This procedure is
continued with each hypersphere stopping after it orders a class 1
observation.

(3) CHS (Conditioned Hyperspheres Stop): The hyperspheres which
order the observations stop as they did in the OHC procedure. However,
the procedures differ when an expanding hypersphere intersects a class
1 observation which has already been ordered by another hypersphere.
In the CHS procedure the hypersphere stops even though the block has not
been completed.
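
A minimal sketch of the AHE procedure is given below, under several assumptions that this summary does not spell out: the hyperspheres are centered on the main speaker (class 2) training observations, they all share one radius as they expand, distance is Euclidean, and the first $m$ blocks together form the union of the hyperspheres at the radius where the $m$th impostor (class 1) training observation is reached. The function names are hypothetical, and the OHC and CHS variants are not covered.

```python
import math


def ahe_radius(main_train, impostor_train, m):
    """Common radius at which the m-th impostor training observation is reached."""
    nearest = sorted(min(math.dist(x, c) for c in main_train)
                     for x in impostor_train)
    return nearest[m - 1]


def classify_ahe(x, main_train, radius):
    """True if x lies inside some hypersphere, i.e. is accepted as the main speaker."""
    return any(math.dist(x, c) < radius for c in main_train)
```

In this reading the region depends on the impostor training set only through the single radius returned by `ahe_radius`.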

The number of blocks, $m$, which are used to form the region
R should be chosen so that the classifier will produce the desired false
alarm rate. In this experiment there was also another objective. There
should be a sufficient number of blocks in R so that the differences in
the above ordering procedures become evident. The number $m = 7$ was
thought to be a good compromise between a number small enough for a
respectable false alarm rate and a number large enough to exhibit differences
in the ordering procedures. For $m = 7$ and $n_1 = 208$ the following
parameters are obtained for the classifier. $P_{FA}$ is the false alarm
probability, $\Pr(\cdot)$ is the probability of $(\cdot)$, $E(\cdot)$ is the expected value
of $(\cdot)$, and $\sigma(\cdot)$ is the standard deviation of $(\cdot)$.

\[
\Pr(P_{FA} \le .055) = .95, \qquad E(P_{FA}) = .0335, \qquad \sigma(P_{FA}) = .0124
\]
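
These figures are consistent with the standard result for distribution-free tolerance regions, in which the coverage of $m$ out of $n_1 + 1$ statistically equivalent blocks follows a Beta$(m, n_1 + 1 - m)$ distribution. The sketch below assumes that this is the model behind the quoted values (the derivation itself is in Chapter 3 and is not reproduced here); the function names are hypothetical.

```python
import math


def pfa_moments(m, n1):
    """Mean and standard deviation of a Beta(m, n1 + 1 - m) coverage."""
    mean = m / (n1 + 1)
    var = m * (n1 + 1 - m) / ((n1 + 1) ** 2 * (n1 + 2))
    return mean, math.sqrt(var)


def pfa_cdf(p, m, n1):
    """Pr(P_FA <= p) for Beta(m, n1 + 1 - m).

    Uses the identity that this equals the probability of at least m
    successes in n1 Bernoulli trials with success probability p.
    """
    below = sum(math.comb(n1, k) * p ** k * (1 - p) ** (n1 - k)
                for k in range(m))
    return 1.0 - below


mean, sd = pfa_moments(7, 208)
```

For $m = 7$ and $n_1 = 208$ the mean and standard deviation round to the values quoted above, and `pfa_cdf(0.055, 7, 208)` comes out close to the quoted .95.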

Table 4.4 shows the results which were obtained when the three
DFTR procedures were trained and tested in the three different 256-dimensional
spaces. The test false alarm rate, the test miss rate, and
the total error rate are listed for each procedure. For example, for
the AHE procedure in measurement space 1, 15 out of 439 impostor
test observations were classified as the main speaker and 7 out of 185
main speaker test observations were classified as an impostor. This
gives a total error rate of 22/624 = .0352. The average error rate
obtained for all three ordering procedures in measurement space 1 was
.0347. This compared with an average error rate of .0251 for measurement
space 2 and an average error rate of .0336 for measurement space
3. Along with having the lowest average error rate, measurement space
2 also had the lowest error rate for each ordering procedure. It was
therefore judged to be the best space. One may speculate why this is
true. First, the intervals in which the 8 spectra are calculated usually
cover the entire utterance of "my" for the main speaker and for most of
the impostors. See Figure 4.2. Therefore much of the information for
the recognition of the main speaker and most of the impostors should be
contained in this space. One can argue that the space of 16 spectra with
16 components per spectrum emphasized the time element too strongly.
Hence an impostor who changes phonemes at the same rate as the main
speaker may be easily classified as the main speaker. (This seems to
be the case with Impostor 6 as will be discussed later.) Also one may
expect the "time features" to vary more from sitting to sitting than the
"frequency features." Hence one might expect a high miss rate in
measurement space 3 since the main speaker was recorded at 9 different
sittings over a period of a month. This high miss rate is seen in Table
4.4 for the AHE and OHC procedures. Measurement space 1 does not
perform as well as the other two spaces because (1) less information is
extracted from the speech waveform, (2) a large variation in the placement
of the second and third spectra can occur over different utterances,
and (3) only two adjacent frequency components were averaged, thus
possibly resulting in excessive spectral fluctuations.


Table 4.4. Three Different 256-dimensional Measurement Spaces

Training Data

  40 Main Speaker Training Samples      208 Impostor Training Samples

  $\Pr(P_{FA} \le .07) = .99$           $\Pr(P_{FA} \le .055) = .95$
  $E(P_{FA}) = .0335$
  $\sigma(P_{FA}) = .0124$

Test Results

  185 Main Speaker Test Samples         439 Impostor Test Samples

Measurement Space 1: 4 spectra, 64 components/spectrum

  Procedure    False Alarm        Miss               Total Error
  (1) AHE      15/439 = .0342     7/185 = .0379      .0352
  (2) OHC      15/439 = .0342     7/185 = .0379      .0352
  (3) CHS      14/439 = .0319     7/185 = .0379      .0337

Measurement Space 2: 8 spectra, 32 components/spectrum

  Procedure    False Alarm        Miss               Total Error
  (1) AHE      12/439 = .0274     7/185 = .0379      .0305
  (2) OHC      10/439 = .0228     6/185 = .0324      .0256
  (3) CHS       9/439 = .0205     3/185 = .0162      .0192

Measurement Space 3: 16 spectra, 16 components/spectrum

  Procedure    False Alarm        Miss               Total Error
  (1) AHE       7/439 = .0159     26/185 = .1405     .0529
  (2) OHC       4/439 = .0091     14/185 = .0756     .0288
  (3) CHS       8/439 = .0181      4/185 = .0216     .0192

Now compare the three ordering procedures. Very little difference
is noted in the total error rates for the three procedures in measurement
space 1. In the other two spaces the CHS procedure and the OHC procedure
easily outperform the AHE procedure. This might have been expected
since the CHS and the OHC procedures use the information that becomes
available during the ordering to select subsequent ordering functions.
Hence the region for these procedures is somewhat shaped by the
observations.

The average error rate for the AHE ordering procedure in the
three spaces is .0395. This compares with an average error rate of
.0299 for the OHC procedure and an average error rate of .0240 for the
CHS procedure.
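
The total and average error rates quoted above follow directly from the error counts in Table 4.4, as the short calculation below illustrates. The dictionary layout and names are illustrative only. Averages computed from the raw counts can differ from the printed ones in the last digit, because the printed averages appear to be taken over already rounded rates.

```python
IMPOSTOR_TESTS, MAIN_TESTS = 439, 185

# (false alarms, misses) from Table 4.4, by measurement space and procedure.
counts = {
    1: {"AHE": (15, 7), "OHC": (15, 7), "CHS": (14, 7)},
    2: {"AHE": (12, 7), "OHC": (10, 6), "CHS": (9, 3)},
    3: {"AHE": (7, 26), "OHC": (4, 14), "CHS": (8, 4)},
}


def total_error(false_alarms, misses):
    """Fraction of all 624 test observations that were misclassified."""
    return (false_alarms + misses) / (IMPOSTOR_TESTS + MAIN_TESTS)


space_avg = {s: sum(total_error(*c) for c in procs.values()) / len(procs)
             for s, procs in counts.items()}
proc_avg = {p: sum(total_error(*counts[s][p]) for s in counts) / len(counts)
            for p in ("AHE", "OHC", "CHS")}
```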

A detailed study was made of the training and test data in the
three spaces. Table 4.5 shows the errors made by the particular speakers
in these spaces. The errors made by the main speaker are listed by the
number of the sitting at which the speech was recorded. MS9 is the sitting
of the main speaker which was recorded on the impostor tape.


Table 4.5. Test Errors in 3 Different 256-dimensional Measurement Spaces

                  Space 1             Space 2             Space 3
                 4 spectra           8 spectra          16 spectra
               64 components       32 components       16 components

               AHE  OHC  CHS       AHE  OHC  CHS       AHE  OHC  CHS

Impostors
  IM 2          1    -    -         -    -    -         -    -    -
  IM 6          1    -    -         1    1    -         5    4    5
  IM 9          -    -    -         1    -    2         -    -    -
  IM 10         2    2    2         -    -    2         -    -    1
  IM 11         -    -    1         1    -    -         1    -    -
  IM 12         4    4    4         1    1    1         -    -    -
  IM 13         2    1    1         4    2    2         -    -    1
  IM 15         -    1    -         -    -    -         1    -    -
  IM 16         2    2    2         4    6    2         -    -    1
  IM 17         1    1    -         -    -    -         -    -    -
  IM 18         1    1    1         -    -    -         -    -    -
  IM 21         -    1    1         -    -    -         -    -    -
  IM 22         1    2    2         -    -    -         -    -    -

Main Speaker
  MS 1          -    -    -         -    -    -         1    -    -
  MS 2          -    -    -         -    -    1         2    2    1
  MS 3          1    1    1         2    2    -         2    -    -
  MS 5          -    -    -         -    -    -         1    1    1
  MS 6          3    3    3         1    1    -         3    2    -
  MS 7          -    -    -         1    1    -         1    -    -
  MS 8          -    -    -         2    1    1         4    2    -
  MS 9          3    3    3         1    1    1        12    7    2


The main results of this study are:

(1) There is a definite correlation between the impostors which
are closest in Euclidean distance to the main speaker in the training
phase (this was determined from the ordering) and the impostors which
made the errors in the test phase. This indicates that 8 utterances are
probably sufficient to represent a speaker at one sitting. For example,
the four closest impostors in the training phase in measurement space 1
were impostors 16, 13, 12, and 15, in that order. The impostors who
made the most errors during the test phase were impostors 12, 16, 10,
22, and 13. The four closest impostors in the second space were impostors
13, 16, 18, and 12. The most errors were made by impostors 16, 13,
and 12. The four closest impostors in the 3rd space were impostors 6, 16,
13, and 9. Impostor 6 made the most errors in that space.

(2)  The  impostors  which  were  relatively  good  substitutes  for  the 
main  speaker  in  one  space  were  not  necessarily  good  substitutes  for  the 
main  speaker  in  one  of  the  other  spaces.  For  example,  the  observations 
of impostor 12 were near the observations of the main speaker in measurement
space 1 but were not in measurement space 3.

(3) There was a tendency for main speaker observations of the
same sitting to cluster more than main speaker observations of different
sittings. This was manifest in smaller error rates for MS1, MS2, MS3,
MS4, and MS5 than for MS6, MS7, MS8, and MS9, a situation which
occurred for all cases except the CHS procedure in measurement space
3. Furthermore, the closest main speaker training observation to a
main speaker test observation from MS1, ..., MS5 was usually a training
observation which was recorded at the same sitting. This occurred for
over 70% of the test observations from MS1, ..., MS5.

(4) The AHE procedure in measurement space 3 yielded a relatively
poor miss rate. This was because the training observations from Impostor
6 were near those of the main speaker. In the training phase the ordering
hyperspheres for the AHE procedure continued to expand into the training
observations of Impostor 6. Hence the hyperspheres were stopped before
they became large enough to yield a good miss rate. For the OHC and
CHS procedures, the hyperspheres which were expanding into the training
observations of Impostor 6 were stopped, allowing the remaining hyperspheres
to become much larger than the hyperspheres for the AHE procedure.
Here is a case where the OHC and the CHS procedures clearly
perform better than the AHE procedure.

4.8.  Reduced  Measurement  Space 


In  this  section  the  hypersphere  DFTR  approaches  are  experimentally 
compared  with  other  classification  methods  using  two  different  main 
speakers.  Because  of  the  excessive  computing  time  for  these  experiments 
in  a  256-dimensional  measurement  space,  it  was  decided  to  reduce  the 
size of the space. For example, the use of the nearest neighbor rule in
a 256-dimensional space would have taken approximately 10 hours of
computing  time.  This  is  partially  due  to  the  fact  that  the  pertinent  data 
for  the  nearest-neighbor  rule  would  not  fit  into  the  core  of  the  IBM  7094. 

The  space  which  gave  the  best  results  in  the  3  space  experiment 
was  that  space  consisting  of  8  spectra  with  32  components  per  spectrum. 

It  was  decided  that  approximate  coverage  of  the  word  "my"  with  spectra 
was  important.  A  compromise  of  6  spectra  placed  uniformly  across 
"my"  was  reached  for  the  reduced  dimensional  space.  The  number  of 
components  per  spectrum  was  reduced  to  8,  making  the  reduced  space  a 
48-dimensional space.

Table  4.6  shows  the  results  for  this  space  when  two  different  main 


A-126 


Tabic  4.6.  48-dimcnsional  Space,  Two  Main  Speakers 


6  spectra,  8  components /spectrum 


Decision 

Procedure 

RF  =  Main  Speaker 

JD  =  Main  Speaker 

F  alse 
Alarm 

Miss 

Total 

Error 

F  alse 
Alarm 

Miss 

Total 

Error 

Pr(PFA  -  •055)  =  *95 

E(pfa)  =  *0335 

(1)  AHE 

.0319 

.  2972 

.1105 

.0297 

.  3162 

.  0900 

(2)  OHC 

.0297 

.1837 

.  0754 

.0297 

.2991 

.0863 

(3)  CHS 

.0342 

.1243 

.  0609 

.0297 

.2991 

.  0863 

Pr(P  c  .085)  =  .95 

F  A  “ 

e(pfa}  =  -0574 

(1)  AHE 

.  0550 

.2000 

.0977 

.  0388 

.  1710 

.  0665 

(2)  OHC 

.  0455 

.  0650 

.0512 

.  0410 

.  1625 

.  0665 

(3)  CHS 

.0387 

.  0595 

.0448 

.  0434 

.  1368 

.  0629 

NN  Rule 

.0205 

.  0216 

.0209 

.  0369 

.1111 

.  0522 

5  MS,  26  IM 

.0228 

.  0216 

.  0224 

1  MS,  26  IM 

.0137 

.  0973 

.  0385 

1  MS,  1  IM 

.  1985 

.  0000 

.1395 

A-127 


speakers  were  used.  The  first  row  shows  the  DFTR  results  when  7  blocks 
were  used  to  form  R^.  The  next  row  shows  the  results  when  12  blocks 
were  used  to  form  R^.  The  results  for  the  different  DFTR  procedures 
varied  less  for  speaker  JD  than  for  speaker  RF.  The  average  error 
rate  over  all  three  DFTR  procedures  was  approximately  equal  for  each 
speaker. The average error for the 7 block DFTR problem was .0823
for RF and .0875 for JD. The average error for the 12 block DFTR
problem was .0646 for RF and .0653 for JD.
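
The averages quoted above can be reproduced from the Total Error columns of Table 4.6. The sketch below is only an arithmetic check; the dictionary layout and the function name `average_error` are illustrative choices, not part of the original experiment.

```python
# Total error rates for the three DFTR procedures (AHE, OHC, CHS),
# transcribed from Table 4.6 and keyed by (main speaker, number of blocks).
TOTAL_ERROR = {
    ("RF", 7): [0.1105, 0.0754, 0.0609],
    ("JD", 7): [0.0900, 0.0863, 0.0863],
    ("RF", 12): [0.0977, 0.0512, 0.0448],
    ("JD", 12): [0.0665, 0.0665, 0.0629],
}


def average_error(rates):
    """Unweighted mean of the per-procedure total error rates."""
    return sum(rates) / len(rates)


if __name__ == "__main__":
    for (speaker, blocks), rates in TOTAL_ERROR.items():
        print(f"{speaker}, {blocks} blocks: {average_error(rates):.4f}")
```

Each average is a simple mean over the three procedures, which is why it moves little between speakers even though the individual procedures differ.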

The nearest-neighbor (NN) rule* was then used to classify the
observations. The nearest-neighbor rule performed better than the
best  DFTR  procedure  for  both  speakers.  This  might  have  been  expected 
from the nature of the decision rules. Both the NN and the DFTR procedures
use the information about all main speaker training observations.
However, the DFTR procedures use less information about the impostor
training observations than the NN rule. The AHE-DFTR procedure uses
only the information about the m-th impostor training observation to obtain
the decision regions. The OHC and the CHS procedures use the information
about the first m impostor training observations to obtain the
decision regions. The NN procedure, however, uses information about
all the impostor training observations to obtain the decision regions.

This  allows  the  main  speaker  observations  to  fluctuate  more  in  certain 

*  See  Chapter  5  for  details  of  the  NN  rule. 
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directions  before  errors  are  made  than  the  DFTR  procedures  allow.  Other 
differences  in  the  decision  rules  may  tend  to  cancel.  This,  of  course, 
depends  on  the  situation.  For  example,  the  NN  rule  allows  the  impostor 
test observations which are near the main speaker observations to fluctuate
more  before  errors  result  than  the  DFTR  rules  allow.  However,  the 
DFTR  rule  allows  the  main  speaker  observations  to  fluctuate  more  toward 
the  nearest  impostor  observations  before  errors  result.  This  is  because 
the  hyperspheres  forming  R^  in  the  DFTR  procedures  actually  intersect 
the  closest  impostor  training  observations.  But  the  hyperplanes  forming 
R^  in  the  NN  procedure  are  constructed  midway  between  each  main 
speaker  and  each  impostor  training  observation. 

Considerations  that  tend  to  favor  the  DFTR  rule  over  the  NN  rule 

are: 

(1) The NN rule takes 5 times longer to test an observation than
the DFTR rule. (This was for 40 main speaker training observations
and 208 impostor training observations in a 48-dimensional space.) Thus
if the rules are compared on an equal computing time basis, the DFTR
rules are, in fact, superior in performance. Note further that the
CHS-DFTR procedure in measurement space 2 is superior in both error rate
and computing time to the NN rule in the reduced space. (The DFTR
procedures in measurement space 2 took 2 times longer to test than the
DFTR procedures in the reduced space. See section 4.11.)

(2)  The  DFTR  rule  allows  the  machine  designer  to  know  how 
well  the  machine  is  expected  to  perform;  i.e.  it  gives  information  about 
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the  expected  false  alarm  and  the  expected  miss  rates. 

(3)  The  DFTR  rule  requires  storage  of  the  main  speaker  training 
observations  and  approximately  m  impostor  training  observations  (more 
for  the  CHS  procedure;  less  for  the  AHE  procedure)  whereas  the  NN  rule 
requires  storage  of  the  main  speaker  training  observations  and  all  of 
the impostor training observations.

Other  decision  rules  were  applied  to  the  data  with  RF  as  the 
main speaker. The rule labeled 5 MS, 26 IM in Table 4.6 was obtained
as follows. A sample mean was calculated from the 8 main speaker
training  observations  at  each  sitting.  These  5  sample  means  were  used 
as  main  speaker  reference  observations.  The  sample  mean  was  then 
calculated  from  the  training  observations  of  each  impostor.  This  gave 
26  impostor  reference  observations.  The  nearest-neighbor  rule  was  then 
employed  using  the  5  main  speaker  reference  observations  and  the  26 
impostor  reference  observations.  Very  good  results  were  obtained  from 
this procedure as is seen in Table 4.6. This indicates that the
observations from each sitting were well clustered.

For  the  rule  labeled  1  MS,  26  IM,  one  reference  vector  was 
calculated  from  the  main  speaker  training  observations.  The  26  impostor 
reference  observations  were  again  used.  The  nearest-neighbor  rule  was 
employed  and  a  rather  good  error  rate  resulted.  Note  that  the  miss 
rate  jumped  from  .0216  when  5  main  speaker  reference  observations  were 
used  to  .0973  when  1  main  speaker  reference  observation  was  used.  This 
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further  indicates  that  the  main  speaker  observations  from  all  sittings 
are  not  clustered  as  well  as  the  main  speaker  observations  from  one 
sitting. 

For the rule labeled 1 MS, 1 IM the impostor training observations
were averaged to obtain one impostor reference observation. A
hyperplane was then constructed midway between the main speaker
reference  observation  and  the  impostor  reference  observation.  Rather 
poor  error  rates  resulted  as  seen  from  the  table. 

4.9. Comparison of Expected False Alarm Probability with Test False
Alarm Rate

One of the most important purposes of the speaker recognition
experiment was the comparison of the probability of false alarm for
which the pattern recognizer was designed and the false alarm rate which
was obtained from the test observations. In one experiment a pattern
recognizer was designed using 208 impostor training observations, and
seven  blocks  to  form  region  R^.  Hence  this  pattern  recognizer  had  the 

following characteristics, where $P_{FA}$ denotes the false alarm
probability, $\Pr(\cdot)$ denotes probability, $E(\cdot)$ denotes expected
value, and $\sigma(\cdot)$ denotes standard deviation.

$$\Pr(P_{FA} \le .055) = .95$$
$$E(P_{FA}) = .0335$$
$$\sigma(P_{FA}) = .0124$$

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The  test  results  for  the  pattern  recognizers  designed  by  the  three  hyper- 

sphere  DFTR  procedures  are  shown  in  Table  4.7.  The  last  two  rows 

in  this  table  show  the  results  for  pattern  recognizers  which  were  designed 

using  12  blocks  to  form  R^.  For  this  case  the  characteristics  were 

$$\Pr(P_{FA} \le .085) = .95$$
$$E(P_{FA}) = .0574$$
$$\sigma(P_{FA}) = .0160$$
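
The design characteristics quoted above are consistent with the standard distribution-free result that the coverage of m blocks out of n + 1 follows a Beta(m, n - m + 1) distribution. The sketch below assumes that model (it is not quoted from this report) and evaluates the mean, the standard deviation and a 95% upper tolerance limit for n = 208 impostor training observations. The function names are illustrative, and the Beta distribution function is computed from the equivalent binomial tail so that only the standard library is needed.

```python
import math


def beta_mean_sd(m, n):
    """Mean and standard deviation of Beta(m, n - m + 1)."""
    a, b = m, n - m + 1
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


def coverage_cdf(x, m, n):
    """P(coverage <= x) for Beta(m, n - m + 1).

    Uses the identity I_x(m, n - m + 1) = P(Binomial(n, x) >= m).
    """
    lower_tail = sum(
        math.comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(m)
    )
    return 1.0 - lower_tail


def upper_tolerance_limit(m, n, level=0.95):
    """Solve coverage_cdf(x) = level for x by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if coverage_cdf(mid, m, n) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for m in (7, 12):
        mean, sd = beta_mean_sd(m, 208)
        limit = upper_tolerance_limit(m, 208)
        print(f"m={m}: E={mean:.4f} sd={sd:.4f} 95% limit={limit:.3f}")
```

Under this assumption the expected value reduces to m/(n + 1), that is 7/209 for seven blocks and 12/209 for twelve blocks.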

Consider  the  results  in  the  top  row  of  the  table.  Here  a  256- 
dimensional  space  consisting  of  4  spectra  with  each  spectrum  having  64 
frequency  components  was  used.  The  main  speaker  was  RF  and  m  =  7 
blocks  were  used  to  form  region  R^.  Using  the  training  sample,  a 
pattern recognizer was designed in which we are 95% confident that the
false alarm rate is less than .055. The expected false alarm rate is
.0335. Using 439 impostor test observations, a false alarm rate of
.0342 was obtained. A 95% upper confidence limit can be obtained from
these test results. For this particular example we are 95% confident
that the false alarm rate is less than .055.

The confidence limit on the test results can be obtained as
follows. The test observations are assumed to be independent. Let $n$ be
the total number of impostor test observations. Let $n_{FA}$ be the number
of impostor test observations which are erroneously classified as the
main speaker. The distribution for $n_{FA}$ is given by

$$p(n_{FA}) = \binom{n}{n_{FA}} P_{FA}^{\,n_{FA}} (1 - P_{FA})^{\,n - n_{FA}}$$

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Table  4.7.  Comparison  of  Expected  False  Alarm 
Probability with Test False Alarm Rate

                           TRAINING SAMPLE             TEST SAMPLE
                        95% Upper                   False   95% Upper Confidence
                        Tolerance Limit             Alarm   Limit on Test False
Situation               on P_FA          E(P_FA)    Rate    Alarm Rate

4 spectra, RF=MS, m=7
  (1) AHE               .055             .0335      .0342   .055
  (2) OHC               .055             .0335      .0342   .055
  (3) CHS               .055             .0335      .0319   .05

8 spectra, RD=MS, m=7
  (1) AHE               .055             .0335      .0274   .045
  (2) OHC               .055             .0335      .0228   .04
  (3) CHS               .055             .0335      .0205   .035

16 spectra, RF=MS, m=7
  (1) AHE               .055             .0335      .0159   .03
  (2) OHC               .055             .0335      .0091   .02
  (3) CHS               .055             .0335      .0181   .03

6 spectra, RF=MS, m=7
  (1) AHE               .055             .0335      .0319   .05
  (2) OHC               .055             .0335      .0297   .05
  (3) CHS               .055             .0335      .0342   .055

6 spectra, JD=MS, m=7
  (1) AHE               .055             .0335      .0297   .05
  (2) OHC               .055             .0335      .0297   .05
  (3) CHS               .055             .0335      .0297   .05

6 spectra, RF=MS, m=12
  (1) AHE               .085             .0574      .0550   .08
  (2) OHC               .085             .0574      .0455   .07
  (3) CHS               .085             .0574      .0387   .06

6 spectra, JD=MS, m=12
  (1) AHE               .085             .0574      .0388   .06
  (2) OHC               .085             .0574      .0410   .065
  (3) CHS               .085             .0574      .0434   .065

See Cramer (1946). The maximum likelihood estimate of $P_{FA}$ is obtained
by equating the partial derivative of the above equation with respect to
$P_{FA}$ to zero. The maximum likelihood estimate of $P_{FA}$ is given by

$$\hat{P}_{FA} = n_{FA}/n .$$

This is precisely the test false alarm rate. A $100\tau\%$ upper confidence
limit $\theta$ can be defined, with $p(\cdot)$ denoting the probability
density of $(\cdot)$.

$$\Pr(P_{FA} \le \theta) = \int_0^{\theta} p(P_{FA})\, dP_{FA} = \tau$$

From this equation it is possible to find $\theta$ for a given $\tau$. The
values of $\theta$ for $\tau = .95$, which are listed in Table 4.7, were
obtained from graphs found in Crow, Davis, and Maxfield (1960).
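
As a rough numerical counterpart to reading the limit from graphs, the sketch below computes a one-sided exact binomial (Clopper-Pearson style) upper confidence limit. This construction is an assumption chosen for illustration and may differ in detail from the charts used in the report. The count of 15 false alarms is inferred from the quoted rate of .0342 on 439 impostor test observations.

```python
import math


def binomial_cdf(x, n, p):
    """P(Binomial(n, p) <= x)."""
    return sum(
        math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(x + 1)
    )


def upper_confidence_limit(n_fa, n, level=0.95):
    """One-sided exact upper limit on the false alarm probability.

    Returns the value theta at which observing n_fa or fewer false
    alarms in n trials has probability 1 - level.
    """
    alpha = 1.0 - level
    lo, hi = n_fa / n, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if binomial_cdf(n_fa, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n_fa, n = 15, 439
    print(f"test false alarm rate  = {n_fa / n:.4f}")
    print(f"95% upper limit (exact) = {upper_confidence_limit(n_fa, n):.3f}")
```

With the test false alarm rate held fixed, the limit returned by `upper_confidence_limit` tightens as the number of test observations grows, which is the behavior the discussion of the table relies on.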

The  most  important  results  from  this  table  are: 


(1)  All  test  false  alarm  rates  lie  in  the  95%  upper  tolerance  limit. 

(2)  In  all  cases  we  are  95%  confident  that  the  true  false  alarm  rate  lies 
within  the  limit  obtained  from  the  95%  upper  tolerance  regions.  A  note 
of  clarification  needs  to  be  made  about  this.  Assuming  that  the  test  false 
alarm  rate  stays  the  same,  the  upper  confidence  limit  on  the  false  alarm 
rate  (at  the  same  confidence  level)  will  decrease  as  more  observations 
are  tested.  Since  more  observations  were  used  in  the  test  phase  than  in 
the  training  phase,  we  want  the  upper  confidence  limit  on  the  false  alarm 
rate  at  confidence  level  95%  to  be  less  than  the  upper  tolerance  level  on 
the  false  alarm  probability  at  tolerance  level  95%.  This  is  the  result 
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which  was  obtained. 


(3) (a) All test false alarm rates are less than the expected false alarm
rate plus the standard deviation.

(b) In 18 out of 21 cases the test false alarm rate is less than the
expected false alarm rate.

(c) In 6 out of 21 cases the test false alarm rate is as close as one
observation out of 439 to the expected false alarm rate. That is, if the
test false alarm rate is less than $E\{P_{FA}\}$, one more impostor which is
classified as the main speaker will cause the test false alarm rate to be
greater than $E\{P_{FA}\}$. Likewise, if the test false alarm rate is greater
than $E\{P_{FA}\}$, one less impostor which is erroneously classified will
cause the test false alarm rate to be less than $E\{P_{FA}\}$.

(d) In 14 out of 21 cases the test false alarm rate falls in the
interval $[E\{P_{FA}\} - \sigma\{P_{FA}\},\ E\{P_{FA}\} + \sigma\{P_{FA}\}]$.

(4) (a) In the 15 cases where $E\{P_{FA}\} = .0335$, the average test false
alarm rate, $FA_{AV}$, obtained by summing the 15 test false alarm rates
and dividing by 15 was

$$FA_{AV} = .0267 .$$

This gives the following relation

$$FA_{AV} \approx .8\, E\{P_{FA}\} .$$

(b) In the 6 cases where $E\{P_{FA}\} = .0574$, $FA_{AV}$ was found to
be .0438. This gives

$$FA_{AV} \approx .76\, E\{P_{FA}\} .$$

(5) There are no significant differences in the relationship of the average
test false alarm rate to the expected false alarm for the various DFTR
procedures. The results are:

For the AHE procedure, $FA_{AV} \approx .82\, E\{P_{FA}\}$

For the OHC procedure, $FA_{AV} \approx .75\, E\{P_{FA}\}$

For the CHS procedure, $FA_{AV} \approx .78\, E\{P_{FA}\}$.
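
The group averages in result (4) can be checked from the tabulated test false alarm rates. In the sketch below the rates are transcribed from Table 4.7, and two of the m = 12 rates (.0550 and .0410) are read from the corresponding columns of Table 4.6. Because the tabulated rates are already rounded to four places, the averages agree with the quoted figures only to about one unit in the fourth decimal. The names are illustrative.

```python
# Expected false alarm probabilities for the two designs.
E_M7, E_M12 = 0.0335, 0.0574

# Test false alarm rates, in table order (AHE, OHC, CHS per situation).
RATES_M7 = [
    0.0342, 0.0342, 0.0319,   # 4 spectra
    0.0274, 0.0228, 0.0205,   # 8 spectra
    0.0159, 0.0091, 0.0181,   # 16 spectra
    0.0319, 0.0297, 0.0342,   # 6 spectra, RF
    0.0297, 0.0297, 0.0297,   # 6 spectra, JD
]
RATES_M12 = [
    0.0550, 0.0455, 0.0387,   # 6 spectra, RF
    0.0388, 0.0410, 0.0434,   # 6 spectra, JD
]


def average(values):
    """Arithmetic mean of a list of rates."""
    return sum(values) / len(values)


def ratio_to_expected(rates, expected):
    """Average test rate expressed as a fraction of the design value."""
    return average(rates) / expected


if __name__ == "__main__":
    print(f"m=7 : FA_AV={average(RATES_M7):.4f}, "
          f"ratio={ratio_to_expected(RATES_M7, E_M7):.2f}")
    print(f"m=12: FA_AV={average(RATES_M12):.4f}, "
          f"ratio={ratio_to_expected(RATES_M12, E_M12):.2f}")
```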


4.10. Comparison of a Measure of the Expected Miss Rate with the Test
Miss Rate

A  method  was  presented  in  Chapter  3  for  obtaining  a  measure  of 
the expected miss probability. The method had straightforward application
to a classifier constructed by the AHE procedure since all hyperspheres
making up region R have the same radius. However, for the
OHC and CHS procedures this is not the case. For these procedures
the  number  of  P^  blocks  contained  in  R^  varies  with  the  P^  observation 
with  which  the  ordering  starts.  An  obvious  way  to  overcome  this  dilemma 
is  to  average  over  the  number  of  blocks  which  are  obtained  when  each  P7 
observation  is  used  to  start  the  ordering.  This,  however,  is  usually  a 
prohibitively time consuming process. The measures of the expected
miss probability for the classifiers designed by the OHC and CHS procedures
were obtained principally by two different methods. One method,
labeled OHC-R or CHS-R in Tables 4.8a and 4.8b, uses the information
about the radii of the hyperspheres making up R^ to order the P^
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observations.  The  method  labeled  OHC  or  CHS  (no  R)  in  Tables  4.  8a 
and 4.8b does not use information about these radii in ordering the
observations. The ordering functions for the OHC and the CHS procedures
are given by equation 3.51. The ordering functions for the OHC-R and
the CHS-R procedures are given by equation 3.59.

Table  4.  8a  compares  the  measure  of  the  miss  probability 
obtained  in  the  training  phase  with  the  miss  rate  obtained  in  the  test  phase 
for  the  3  different  256-dimensional  spaces.  Table  4.8b  compares  these 
quantities  for  the  experiments  in  the  48-dimensional  space.  Consider 
the  first  row  of  Table  4.8a.  For  the  AHE  procedure  with  main  speaker 
RF  and  seven  blocks  used  to  form  R^,  37  P^  blocks  were  found 
in  region  R^.  Note  that  there  is  also  at  least  part  of  an  additional  P^ 
block  in  R^.  The  expected  miss  probability  is  .0976  for  37  blocks  and 
.0732 for 38 blocks. The miss rate for the 185 main speaker test
observations was .0379. The 95% upper tolerance limit on the miss probability
using 37 blocks was .18. The 95% upper confidence limit on the test
miss rate was .07. In the following, the larger number in the column
of expected miss probabilities will be denoted by $E^+$ and the smaller
number by $E^-$. Let the standard deviation corresponding to the number
of  complete  blocks  in  R^  be  cr+.  The  important  results  are: 

(1) The expected miss probability for the OHC and the CHS procedures
is much too conservative if the information about the radii of the
hyperspheres making up R^ is not used in ordering the P^ observations.

Table 4.8a. Comparison of a Measure of the Expected Miss Probability
with the Test Miss Rate in the 256-Dimensional Spaces

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Table  4.  8b.  Comparison  of  a  Measure  of  the  Expected  Miss 
Probability with the Test Miss Rate for the Space of 6
Spectra with 8 Components/Spectrum

                      TRAINING SAMPLE                         TEST SAMPLE
                  Number   95% Upper                                95% Upper
                  of       Tolerance                        Miss    Confidence Limit
Situation         Blocks   Limit on P_m   E{P_m}            Rate    on the Miss Rate

RF=MS, M=7
  (1) AHE         32       .32            .1951-.2195       .2972   .37
  (2) OHC
      OHC-R       32       .32            .1951-.2195       .1837   .25
  (3) CHS         32       .32            .1951-.2195
      CHS-R       35       .24            .1220-.1463       .1243   .18

JD=MS, M=7
  (1) AHE         18       .68            .5366-.5610       .3162   .38
  (2) OHC         20       .64            .4878-.5122
      OHC-R       20       .64            .4878-.5122       .2991   .37
  (3) CHS         20       .64            .4878-.5122
      CHS-R       20       .64            .4878-.5122       .2991   .37

RF=MS, M=12
  (1) AHE         36       .22            .0976-.1220       .2000   .26
  (2) OHC         32       .32            .1951-.2195
      OHC-R       38       .15            .0488-.0732       .0650   .11
  (3) CHS         32       .32            .1951-.2195
      CHS-R       37       .18            .0732-.0976       .0595   .10

JD=MS, M=12
  (1) AHE         33       .30            .1707-.1951       .1710   .225
  (2) OHC         20       .64            .4878-.5122
      OHC-R       34       .27            .1463-.1707       .1625   .22
  (3) CHS         20       .64            .4878-.5122
      CHS-R       35       .24            .1220-.1463       .1368   .19

Let $M_{AV}$ denote the average test miss rate for the 7 cases in Tables
4.8a and 4.8b. Let $E\{P_m\}_{AV}$ be equal to $(E^+ + E^-)/2$ for the same
7 cases. For the OHC and the CHS procedures, the following results are
obtained.

OHC:  $M_{AV} = .48\, E\{P_m\}_{AV}$

CHS:  $M_{AV} = .40\, E\{P_m\}_{AV}$

These are compared with the following results for the OHC-R and CHS-R
procedures.

OHC-R:  $M_{AV} = .84\, E\{P_m\}_{AV}$

CHS-R:  $M_{AV} = .68\, E\{P_m\}_{AV}$

The result for the AHE procedure is

$$M_{AV} = .94\, E\{P_m\}_{AV} .$$

It is therefore concluded that, for a realistic estimate of the expected
probability of a miss, the radii of the hyperspheres making up R
should be used in ordering the observations. Hence from this point
on we will only consider the results for the AHE, OHC-R, and CHS-R
procedures.


It should be noted that the lowest expected miss rate that can be
achieved with the 40 main speaker training sample occurs for $E^- = .0244$
and $E^+ = .0488$. This occurs when 39 blocks are contained in R^.
Note that 39 P^ blocks are counted in R^ for the OHC-R and CHS-R
procedures in the space consisting of 8 spectra with 32 components per
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spectrum  and  for  the  CHS-R  procedure  in  the  space  consisting  of  16  spec¬ 
tra  with  16  components  per  spectrum.  In  both  spaces  the  test  miss  rate 
is  lower  than  expected.  This  is  part  of  the  reason  why  the  ordering 

procedure  gives  pessimistic  results  for  a  classifier  of  the  CHS  design. 
Other  important  results  are: 

(2)  All  test  miss  rates  lie  in  the  95%  upper  tolerance  limit. 

(3)  In  17  out  of  21  cases  we  are  95%  confident  that  the  true  miss  rate  lies 
within  the  limit  obtained  from  the  95%  upper  tolerance  regions. 

(4) (a) In 19 out of 21 cases the test miss rate is less than $E^+ + \sigma^+$.

(b) In 16 out of 21 cases the test miss rate is less than $E^+$.

(c) In 6 out of 21 cases the test miss rate, $M$, satisfies
$E^- < M < E^+$.

(d) In 16 out of 21 cases the test miss rate satisfies
$E^- - \sigma^+ < M < E^+ + \sigma^+$.
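
The expected miss probabilities quoted in this section are consistent with a simple block count: with n = 40 main speaker training observations there are n + 1 = 41 blocks, and a region holding k complete blocks has expected miss probability (n + 1 - k)/(n + 1). The sketch below assumes that relation, which is inferred here from the tabulated values rather than quoted from Chapter 3. The function names are illustrative.

```python
def expected_miss(blocks_in_region, n_train=40):
    """Expected miss probability when the region holds this many blocks."""
    return (n_train + 1 - blocks_in_region) / (n_train + 1)


def expected_miss_bounds(complete_blocks, n_train=40):
    """Return (E_minus, E_plus) for a region with the given complete blocks.

    E_plus counts only the complete blocks; E_minus also credits the
    partially contained additional block.
    """
    e_plus = expected_miss(complete_blocks, n_train)
    e_minus = expected_miss(complete_blocks + 1, n_train)
    return e_minus, e_plus


if __name__ == "__main__":
    for k in (18, 32, 37, 39):
        e_minus, e_plus = expected_miss_bounds(k)
        print(f"{k} blocks: E- = {e_minus:.4f}, E+ = {e_plus:.4f}")
```

For 37 blocks this gives the pair .0732 and .0976 quoted for the first row of Table 4.8a, and for 39 blocks it gives the lowest attainable pair.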

4.11. 7040-7094 Computer Execution Time

The  approximate  7040-7094  DCS  execution  times  for  the  speaker 
verification  experiments  are  discussed  in  this  section.  The  execution 
time for the training phase is approximately equal to the time for
calculation of the spectra plus the time for calculation of the distances between
the  Pj  and  P^  observations  and  for  ordering  them.  Let  "  represent 
seconds  in  the  following  equations.  Let  MS  be  the  number  of  main 
speaker  training  observations  and  IM  be  the  number  of  impostor  training 
observations.  First,  consider  the  space  consisting  of  6  spectra  with  8 
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components  per  spectrum.  The  training  time  is  given  by 


Training Time = 4.4"*(MS + IM) + .035"*(MS*IM)

For 40 main speaker training observations and 208 impostor training
observations,  a  training  time  of  23^  minutes  was  required.  During  the 
test  phase,  it  is  only  necessary  to  calculate  the  spectra  for  the  test 
utterance  and  to  compare  these  spectra  with  those  of  the  main  speaker 
training  utterances.  Let  UT  be  the  number  of  speakers  which  are 
tested.  Then  the  testing  time  is  given  by 

Testing Time = (4.4" + .035"*MS) * UT.

For  one  test  speaker  and  40  main  speaker  training  observations,  the 
testing  time  is  5.8  seconds.  For  the  624  test  speakers,  a  total  test  time 
of  approximately  60  minutes  was  required. 

For the space consisting of 4 spectra with 64 components per
spectrum, the following equations hold.

Training Time = 2.9"*(MS + IM) + .175"*(MS*IM)

Testing Time = (2.9" + .175"*MS) * UT

For 40 main speaker training observations and 208 impostor training
observations, a training time of 36 1/4 minutes was required. For 624 test
speakers,  a  total  test  time  of  approximately  72  minutes  was  required. 
The  time  required  to  test  one  speaker  was  approximately  9.9  seconds. 

For the space consisting of 8 spectra with 32 components per
spectrum, the following equations hold.

Training Time = 4.9"*(MS + IM) + .175"*(MS*IM)

Testing Time = (4.9" + .175"*MS) * UT

For  40  main  speaker  training  observations  and  208  impostor  training 
observations,  a  training  time  of  approximately  45  minutes  was  required. 
For  624  test  speakers,  the  total  test  time  was  approximately  120  minutes. 
The  time  required  to  test  one  speaker  was  approximately  11.9  seconds. 

For  a  space  consisting  of  16  spectra  with  each  spectrum  having 
16  components,  the  following  equations  hold. 

Training Time = 7.4"*(MS + IM) + .175"*(MS*IM)

Testing Time = (7.4" + .175"*MS) * UT

For  40  main  speaker  training  observations  and  208  impostor  training 
observations,  a  training  time  of  55  minutes  was  required.  For  624  test 
speakers,  the  total  test  time  was  approximately  150  minutes.  The  time 
required  to  test  one  speaker  was  approximately  14.4  seconds. 

On the whole, these execution times seem satisfactory for a
practical automatic speaker verification system. However, the IBM 7094
computer is probably larger and faster than a computer which is likely
to be used in an automatic speaker verification system.

Chapter 5

THEORETICAL COMPARISON OF THE PROBABILITY OF
ERROR FOR THE AHE-DFTR PROCEDURE WITH THE
PROBABILITY OF ERROR FOR THE NEAREST
NEIGHBOR RULE

5.1. Summary

In this chapter the small sample performance of the AHE-DFTR
procedure is investigated and compared to the performance of the
nearest-neighbor rule for the two-class situation.

Suppose $n_1$ independent observations are available from class 1
and $n_2$ independent observations are available from class 2, both on
the real line. Let $P_{FA}$, the false alarm probability, be the conditional
probability that a new observation $V$ is classified into class 2 given
that $V$ belongs to class 1. Let $P_M$, the miss probability, be the
conditional probability that $V$ is classified into class 1 given that $V$
belongs to class 2. See equations 1.5 and 1.6. The following results are
obtained without specifying the class probability distributions.

(1) The false alarm probability and the miss probability are
derived for the m-block AHE-DFTR procedure when $n_1$ independent
observations are available from class 1 and one observation is available
from class 2.

(a) For the above conditions the nearest-neighbor (NN) rule
and the one-block DFTR procedures yield identical miss probabilities.

(b) This leads to the result that under these conditions the
miss probability for the m-block AHE-DFTR procedure, m > 1, is less
than or equal to the miss probability for the NN procedure.

(2) For $n_2 > 1$ independent observations from class 2 and $n_1$
independent observations from class 1 an intuitive comparison is made
of the miss probabilities for the NN and AHE-DFTR procedures. The
result is that the miss probability for the NN rule is less than or equal
to the miss probability for the one-block AHE-DFTR procedure.

(3) For one observation from class 2 and one observation from
class 1 it is shown that the DFTR false alarm probability $P_{FA}$ is equal
to one-half. Furthermore, for two independent observations from class
2 and one observation from class 1 it is shown that $P_{FA} = \frac{1}{2}$. This
indicates that the number of class 2 observations does not affect $P_{FA}$,
a result expected from a consideration of distribution-free tolerance
region theory.

(4)  It  is  further  shown  that  the  false  alarm  probability  for  the 
DFTR  rule  can  be  less  than  the  false  alarm  probability  for  the  nearest- 
neighbor  rule. 
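
Result (3) can be illustrated by simulation. The sketch below assumes the one-block AHE rule takes the form used later in this chapter for a single class 2 observation (a new observation goes to class 2 when it lies closer to the class 2 observation than the class 1 training observation does). For several class 2 observations it further assumes equal-radius hyperspheres whose common radius is the smallest distance from the class 1 observation to any class 2 observation. The particular distributions and all function names are arbitrary illustrative choices.

```python
import random


def dftr_false_alarm_rate(n2, draw_class1, draw_class2, trials, rng):
    """Monte Carlo estimate of the one-block DFTR false alarm probability
    with one class 1 training observation and n2 class 2 observations."""
    false_alarms = 0
    for _ in range(trials):
        xs = [draw_class2(rng) for _ in range(n2)]  # class 2 training
        y = draw_class1(rng)                        # class 1 training
        v = draw_class1(rng)                        # new class 1 observation
        radius = min(abs(y - x) for x in xs)
        if min(abs(v - x) for x in xs) < radius:    # falls inside the region
            false_alarms += 1
    return false_alarms / trials


def draw_gaussian(rng):
    """Class 1: standard normal (arbitrary choice)."""
    return rng.gauss(0.0, 1.0)


def draw_shifted_exponential(rng):
    """Class 2: exponential shifted by one (arbitrary choice)."""
    return 1.0 + rng.expovariate(0.5)


if __name__ == "__main__":
    rng = random.Random(1)
    for n2 in (1, 2, 5):
        rate = dftr_false_alarm_rate(
            n2, draw_gaussian, draw_shifted_exponential, 100_000, rng
        )
        print(f"n2 = {n2}: estimated P_FA = {rate:.3f}")
```

The estimate stays near one-half for any choice of the two distributions because the new observation and the class 1 training observation are exchangeable, which is the distribution-free property the summary refers to.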

The results for $P_{FA}$ for the DFTR procedures hold regardless

of the dimensionality of the space. The comparison of P for the NN
rule with P for the DFTR procedure holds regardless of the
dimensionality of the space as long as the same metric is used for both
procedures.

5.2. The Nearest-Neighbor Rule

Under the nearest-neighbor rule a new observation is classified
into the class of its nearest neighbor. The nearest neighbor is that
observation which is closest to the new observation in some metric
distance. The metric distance used in this chapter is Euclidean distance.

Suppose there exist two classes, class 1 described by the
probability distribution $F_1(z)$, and class 2, described by $F_2(z)$. Let
$z$ be a variable on the real line. Suppose one observation is available
from each class. Let the observations be denoted by $X_1$ and $Y_1$, where
$X_1$ is from the population with distribution $F_2$ and $Y_1$ is from the
population with distribution $F_1$. Let $V$ be a new observation. Now
let $P_{FA}$ be the conditional probability that $V$ is classified into class
2 given that $V$ belongs to class 1 and let $P_M$ be the conditional
probability that $V$ is classified into class 1 given that $V$ belongs to
class 2. Also let

$$Z_1 = |X_1 - V| \tag{5.1}$$

and

$$W_1 = |Y_1 - V| . \tag{5.2}$$

Then $P_M$ is the probability that $W_1 < Z_1$ given $V \sim F_2$, where
$V \sim F_2$ denotes that $V$ is from the probability distribution $F_2$.
Likewise, $P_{FA}$ is the probability that $W_1 > Z_1$ given $V \sim F_1$. The
probability distribution of $Z_1$ given that $V = v$ is

$$F_{Z_1}(z_1/v) = \begin{cases} F_2(v+z_1) - F_2(v-z_1) & z_1 \ge 0 \\ 0 & \text{otherwise.} \end{cases} \tag{5.3}$$

The probability density of $Z_1$ given that $V = v$ is

$$f_{Z_1}(z_1/v) = \begin{cases} f_2(v+z_1) + f_2(v-z_1) & z_1 \ge 0 \\ 0 & \text{otherwise.} \end{cases} \tag{5.4}$$

A similar distribution and density are obtained for $W_1$, where $W_1$ is
substituted for $Z_1$, $w_1$ is substituted for $z_1$, and 1 is substituted
for 2. Now let

$$U = Z_1 - W_1 . \tag{5.5}$$

The miss probability is the probability that $U > 0$ given $V \sim F_2$.
The probability distribution for $U$ given $V = v$ is

$$F_U(u/v) = \int_0^{\infty} dw_1 \int_0^{u+w_1} dz_1\, f_{Z_1 W_1}(z_1, w_1/v) . \tag{5.6}$$

Since $Z_1$ and $W_1$ are independent given $V = v$,

$$F_U(u/v) = \int_0^{\infty} dw_1\, F_{Z_1}(u+w_1/v)\, f_{W_1}(w_1/v) . \tag{5.7}$$

The miss probability is given by

$$P_M = \int_{-\infty}^{\infty} f_2(v)\, dv\, [1 - F_U(0/v)] \tag{5.8}$$

where

$$F_U(0/v) = \int_0^{\infty} dw_1\, [F_2(v+w_1) - F_2(v-w_1)]\,[f_1(v+w_1) + f_1(v-w_1)] .$$

Since

$$\int_0^{\infty} dw_1\, [f_1(v+w_1) + f_1(v-w_1)] = 1 \tag{5.9}$$

the miss probability is given by

$$P_M = \int_{-\infty}^{\infty} f_2(v)\, dv \int_0^{\infty} dw_1\, [1 - F_2(v+w_1) + F_2(v-w_1)]\,[f_1(v+w_1) + f_1(v-w_1)] . \tag{5.10}$$


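A similar analysis can be made for the false alarm probability with the
result

$$P_{FA} = \int_{-\infty}^{\infty} f_1(v)\, dv \int_0^{\infty} dw_1\, [F_2(v+w_1) - F_2(v-w_1)]\,[f_1(v+w_1) + f_1(v-w_1)] . \tag{5.11}$$

Equivalent results for $P_M$ and $P_{FA}$ can be obtained by letting
$U = W_1 - Z_1$. These results are listed below because they are used in
a later analysis.

$$P_M = \int_{-\infty}^{\infty} f_2(v)\, dv \int_0^{\infty} dz_1\, [F_1(v+z_1) - F_1(v-z_1)]\,[f_2(v+z_1) + f_2(v-z_1)] \tag{5.12}$$

$$P_{FA} = \int_{-\infty}^{\infty} f_1(v)\, dv \int_0^{\infty} dz_1\, [1 - F_1(v+z_1) + F_1(v-z_1)]\,[f_2(v+z_1) + f_2(v-z_1)] \tag{5.13}$$

The fact that equations 5.12 and 5.10 and equations 5.13 and 5.11
are respectively identical can be seen by integrating equations 5.10 and
5.11 by parts and using equation 5.9.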
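
The equivalence of the two forms of the miss probability can be checked numerically. The sketch below evaluates the double integral of equation 5.10 and its integration-by-parts counterpart (the form integrating over $z_1$ with the class 1 distribution function and the class 2 density) by a midpoint rule. The two unit-variance Gaussian classes with means 0 and 2, the integration limits and the step size are all assumptions made only for this illustration.

```python
import math

MU1, MU2 = 0.0, 2.0  # assumed class means (unit variance for both classes)


def pdf(x, mu):
    """Gaussian density with unit variance."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)


def cdf(x, mu):
    """Gaussian distribution function with unit variance."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0)))


def integrand_510(v, w):
    """Inner integrand of equation 5.10."""
    return (1.0 - cdf(v + w, MU2) + cdf(v - w, MU2)) * (
        pdf(v + w, MU1) + pdf(v - w, MU1)
    )


def integrand_by_parts(v, z):
    """Inner integrand of the integration-by-parts form."""
    return (cdf(v + z, MU1) - cdf(v - z, MU1)) * (
        pdf(v + z, MU2) + pdf(v - z, MU2)
    )


def miss_probability(inner, h=0.05, v_lo=-6.0, v_hi=10.0, w_hi=16.0):
    """Midpoint-rule estimate of P_M = int f2(v) dv int_0^inf inner(v, w) dw."""
    nv = int(round((v_hi - v_lo) / h))
    nw = int(round(w_hi / h))
    total = 0.0
    for i in range(nv):
        v = v_lo + (i + 0.5) * h
        inner_sum = sum(inner(v, (j + 0.5) * h) for j in range(nw))
        total += pdf(v, MU2) * inner_sum * h
    return total * h


if __name__ == "__main__":
    print(f"P_M from equation 5.10     : {miss_probability(integrand_510):.4f}")
    print(f"P_M from by-parts form     : {miss_probability(integrand_by_parts):.4f}")
```

Both estimates approximate the probability that the single class 1 observation lies closer to a new class 2 observation than the single class 2 observation does.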

Suppose $n_2$ statistically independent observations $X_1, \ldots, X_{n_2}$
are available from class 2 and $n_1$ statistically independent observations
$Y_1, \ldots, Y_{n_1}$ are available from class 1. Let $V$ be a new observation.
Now let

$$Z_i = |X_i - V| , \qquad i = 1, \ldots, n_2 \tag{5.14}$$

and

$$W_i = |Y_i - V| , \qquad i = 1, \ldots, n_1 . \tag{5.15}$$

Also let

$$Z = \min_{1 \le i \le n_2} Z_i \tag{5.16}$$

and

$$W = \min_{1 \le i \le n_1} W_i . \tag{5.17}$$

$P_M$ is the probability that $W < Z$ given $V \sim F_2$. $P_{FA}$ is the
probability that $W > Z$ given $V \sim F_1$.

Consider the problem of finding the probability distribution for
$Z$ when $Z = \min(Z_1, Z_2)$. Figure 5.1 shows the region for $\min(z_1, z_2) < z$.
Then

$$F_Z(z) = F_{Z_1}(z) + F_{Z_2}(z) - F_{Z_1 Z_2}(z, z). \tag{5.18}$$

If the variables $Z_1$ and $Z_2$ are independent and identically distributed
with distribution $F_{Z_1}$, the probability distribution for $Z$ is given by

$$F_Z(z) = 1 - [1 - F_{Z_1}(z)]^2. \tag{5.19}$$

Suppose $Z = \min(Z_1, Z_2, Z_3)$. Then

$$F_Z(z) = F_{Z_1}(z) + F_{Z_2}(z) + F_{Z_3}(z) - F_{Z_1 Z_2}(z, z) - F_{Z_1 Z_3}(z, z) - F_{Z_2 Z_3}(z, z) + F_{Z_1 Z_2 Z_3}(z, z, z). \tag{5.20}$$

If the variables $Z_1$, $Z_2$, and $Z_3$ are independent and identically
distributed with distribution $F_{Z_1}$,

$$F_Z(z) = 1 - [1 - F_{Z_1}(z)]^3. \tag{5.21}$$

These equations are easily extended for $Z = \min_{1 \le i \le n} Z_i$.
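
The extension to $n$ independent, identically distributed variables, $F_Z(z) = 1 - [1 - F_{Z_1}(z)]^n$, can be checked numerically. The Python sketch below is illustrative only: the uniform distribution, the function names, and the trial count are assumptions made for the example and are not part of the original analysis.

```python
import random


def empirical_min_cdf(n, z, trials=100_000, seed=0):
    """Estimate Pr[min(Z_1, ..., Z_n) <= z] for Z_i iid Uniform(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        smallest = min(rng.random() for _ in range(n))
        hits += smallest <= z
    return hits / trials


def formula_min_cdf(n, z):
    """Closed form 1 - [1 - F(z)]^n, with F(z) = z for Uniform(0, 1)."""
    return 1.0 - (1.0 - z) ** n


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, empirical_min_cdf(n, 0.2), formula_min_cdf(n, 0.2))
```

The two columns printed for each $n$ should agree to within sampling error; equations 5.19 and 5.21 are the cases $n = 2$ and $n = 3$.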


[Figure: The Transformation Z]

Again consider finding the distributions of $Z$ and $W$ given in
equations 5.14 through 5.17. The probability distribution for $Z_i$, given
$V = v$, is given by equation 5.3. Since $Z_1, Z_2, \ldots, Z_{n_2}$ are independent
and identically distributed given $V = v$, the probability distribution of
$Z$ given $V = v$ is

$$F_Z(z/v) = 1 - [1 - F_{Z_1}(z/v)]^{n_2}. \tag{5.22}$$

The probability density for $W$ in equation 5.17 is easily seen to be

$$f_W(w/v) = n_1\,[1 - F_{W_1}(w/v)]^{n_1 - 1}\, f_{W_1}(w/v). \tag{5.23}$$

Let $U = Z - W$. Then

$$F_U(u/v) = \int_0^{\infty} dw \int_0^{u+w} dz\, f_{W,Z}(w, z/v) \tag{5.24}$$

and

$$P_M = \int_{-\infty}^{\infty} f_2(v)\,dv\,[1 - F_U(0/v)]. \tag{5.25}$$

When the proper substitutions are made, one finds

$$P_M = n_1 \int_{-\infty}^{\infty} f_2(v)\,dv \int_0^{\infty} dw\,[1 - F_2(v+w) + F_2(v-w)]^{n_2}\,[1 - F_1(v+w) + F_1(v-w)]^{n_1 - 1}\,[f_1(v+w) + f_1(v-w)]. \tag{5.26}$$

By a similar procedure one obtains

$$P_{FA} = n_2 \int_{-\infty}^{\infty} f_1(v)\,dv \int_0^{\infty} dz\,[1 - F_1(v+z) + F_1(v-z)]^{n_1}\,[1 - F_2(v+z) + F_2(v-z)]^{n_2 - 1}\,[f_2(v+z) + f_2(v-z)]. \tag{5.27}$$


5.3. The AHE-DFTR Procedure


Now consider the AHE-DFTR procedure. Suppose one observation
is available from each class. Let $V = v$ be a new observation. By the
AHE-DFTR procedure the observation $V$ is assigned to class 1 if

$$|v - x_1| > |y_1 - x_1|. \tag{5.28}$$

Let

$$Z = |Y_1 - X_1| \tag{5.29}$$

and

$$W = |V - X_1|. \tag{5.30}$$

Then a miss error is made when $W > Z$ given $V \sim F_2$. Let $U = W - Z$;
then a miss error is made when $U > 0$ given $V \sim F_2$. Proceeding as
before with $V \sim F_2$,

$$F_U(u/x_1) = \int_0^{\infty} F_W(u+z/x_1)\, f_Z(z/x_1)\,dz,$$

where

$$F_W(u+z/x_1) = \begin{cases} F_2(x_1+u+z) - F_2(x_1-u-z), & u + z > 0 \\ 0, & \text{otherwise} \end{cases} \tag{5.31}$$

and

$$f_Z(z/x_1) = \begin{cases} f_1(x_1+z) + f_1(x_1-z), & z > 0 \\ 0, & \text{otherwise.} \end{cases}$$

Then

$$P_M = \int_{-\infty}^{\infty} f_2(x_1)\,dx_1 \int_0^{\infty} [1 - F_2(x_1+z) + F_2(x_1-z)]\,[f_1(x_1+z) + f_1(x_1-z)]\,dz. \tag{5.32}$$

This equation is identical to equation 5.10. Therefore, given one sample
from each class, the AHE-DFTR procedure has the same miss probability
as the NN rule.
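
The equality of the two miss probabilities can be illustrated by simulation. The sketch below is only a numerical check under assumed conditions: class 2 is taken as N(0, 1) and class 1 as N(1.5, 1), and the function name, shift, and trial count are hypothetical choices, not values from the report.

```python
import random


def one_sample_miss_rates(trials=200_000, shift=1.5, seed=1):
    """Estimate the miss probability of the NN rule and of rule (5.28)
    when one training observation is available from each class."""
    rng = random.Random(seed)
    nn_miss = 0
    dftr_miss = 0
    for _ in range(trials):
        x1 = rng.gauss(0.0, 1.0)     # class 2 training observation
        y1 = rng.gauss(shift, 1.0)   # class 1 training observation
        v = rng.gauss(0.0, 1.0)      # new observation, truly class 2
        # NN rule: miss when V is nearer to the class 1 observation.
        nn_miss += abs(v - y1) < abs(v - x1)
        # AHE-DFTR rule (5.28): miss when |V - X1| > |Y1 - X1|.
        dftr_miss += abs(v - x1) > abs(y1 - x1)
    return nn_miss / trials, dftr_miss / trials


if __name__ == "__main__":
    print(one_sample_miss_rates())
```

The two estimates should differ only by sampling error, which is consistent with the symmetry between $V$ and $X_1$ when both are drawn from $F_2$.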

Consider the error when $V$ is classified into class 2 and $V \sim F_1$.
Now

$$F_W(w/x_1) = \begin{cases} F_1(x_1+w) - F_1(x_1-w), & w > 0 \\ 0, & \text{otherwise} \end{cases} \tag{5.33}$$

and

$$F_U(u/x_1) = \int_0^{\infty} [F_1(x_1+u+z) - F_1(x_1-u-z)]\,[f_1(x_1+z) + f_1(x_1-z)]\,dz, \tag{5.34}$$

and

$$P_{FA} = \int_{-\infty}^{\infty} dx_1\, f_2(x_1)\, F_U(0/x_1). \tag{5.35}$$

Then

$$P_{FA} = \int_{-\infty}^{\infty} dx_1\, f_2(x_1) \int_0^{\infty} [F_1(x_1+z) - F_1(x_1-z)]\,[f_1(x_1+z) + f_1(x_1-z)]\,dz. \tag{5.36}$$

Noting that the last integral is of the form $\int u\,du$, the following result
is obtained.

$$P_{FA} = \frac{1}{2}. \tag{5.37}$$

This is as expected from the DFTR theory since one block is used to
form region $R_2$.
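
The distribution-free character of this result can be seen in a small simulation. The sketch below is an illustration under assumptions: the particular class distributions passed in, the function name, and the trial count are hypothetical and were chosen only to show that the estimate does not depend on them.

```python
import random


def one_block_false_alarm(draw_class1, draw_class2, trials=200_000, seed=2):
    """Estimate P_FA for the one-block rule with one observation per class.

    draw_class1 and draw_class2 are callables taking a random.Random
    instance and returning one observation from the respective class.
    """
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(trials):
        x1 = draw_class2(rng)   # class 2 training observation
        y1 = draw_class1(rng)   # class 1 training observation
        v = draw_class1(rng)    # new observation, truly class 1
        # V is assigned to class 2 when it lies closer to X1 than Y1 does.
        false_alarms += abs(v - x1) < abs(y1 - x1)
    return false_alarms / trials


if __name__ == "__main__":
    print(one_block_false_alarm(lambda r: r.expovariate(1.0),
                                lambda r: r.gauss(2.0, 1.0)))
```

Given $X_1$, the distances $|V - X_1|$ and $|Y_1 - X_1|$ are independent and identically distributed, so the estimate should be close to 1/2 for any continuous choice of the two distributions.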


$n_1$ Class 1 Observations, 1 Class 2 Observation


Now suppose $n_1$ independent identically distributed observations
are available from class 1 and one observation is available from class 2.
Let

$$Z_i = |Y_i - X_1|, \qquad i = 1, \ldots, n_1 \tag{5.38}$$

and

$$W = |V - X_1|. \tag{5.39}$$

Let

$$Z = \min_{1 \le i \le n_1} Z_i. \tag{5.40}$$

Suppose $V \sim F_2$. Then a miss error is made when $W > Z$. Let
$U = W - Z$. Then

$$F_{Z_i}(z_i/x_1) = \begin{cases} F_1(x_1+z_i) - F_1(x_1-z_i), & z_i \ge 0, \quad i = 1, \ldots, n_1 \\ 0, & \text{otherwise} \end{cases} \tag{5.41}$$

and

$$F_W(w/x_1) = \begin{cases} F_2(x_1+w) - F_2(x_1-w), & w > 0 \\ 0, & \text{otherwise.} \end{cases} \tag{5.42}$$

Since the $Z_i$ conditioned on $x_1$ are independent, it is seen from
equation 5.21 that

$$F_Z(z/x_1) = 1 - [1 - F_{Z_1}(z/x_1)]^{n_1}. \tag{5.43}$$

$F_U(u/x_1)$ is equal to

$$F_U(u/x_1) = \int_0^{\infty} dz \int_0^{u+z} dw\, f_W(w/x_1)\, f_Z(z/x_1). \tag{5.44}$$

Substituting the density for $Z$ into equation 5.44 and integrating
over $w$, it is found that

$$F_U(u/x_1) = \int_0^{\infty} dz\, F_W(u+z/x_1)\, n_1\,[1 - F_{Z_1}(z/x_1)]^{n_1 - 1}\, f_{Z_1}(z/x_1). \tag{5.45}$$

Substituting $u = 0$ in equation 5.45 and noting that
$\int_0^{\infty} n_1\,[1 - F_{Z_1}(z/x_1)]^{n_1 - 1} f_{Z_1}(z/x_1)\,dz = 1$, one obtains

$$P_M = \int_{-\infty}^{\infty} dx_1\, f_2(x_1) \int_0^{\infty} dz\, n_1\,[1 - F_2(x_1+z) + F_2(x_1-z)]\,[1 - F_1(x_1+z) + F_1(x_1-z)]^{n_1 - 1}\,[f_1(x_1+z) + f_1(x_1-z)]. \tag{5.46}$$

Note that equation 5.26 with $n_2 = 1$ is identical to equation 5.46.
Therefore, for $n_1$ independent observations from class 1 and one
observation from class 2, the NN rule and the one block DFTR procedure give
identical miss probabilities.

Let region $R_2$ be the union of more than one block ($m > 1$).
Region $R_2$ becomes larger as each block is added. Therefore, for
$m > 1$, $n_1$ class 1 observations, and one class 2 observation, the AHE
miss probability is less than or equal to the NN miss probability,

$$P_M^{DFTR} \le P_M^{NN}. \tag{5.47}$$
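
Inequality 5.47 can be illustrated numerically. The sketch below assumes that the $m$-block AHE region is the interval around $X_1$ whose radius is the $m$th smallest of the distances $|Y_i - X_1|$ (the reading suggested by equation 5.52); the normal class distributions, the function name, and the parameter values are hypothetical.

```python
import random


def miss_rates_by_blocks(n1=5, trials=100_000, shift=1.5, seed=3):
    """Estimate the NN miss probability and the m-block DFTR miss
    probabilities (m = 1..n1) for n1 class 1 and one class 2 observation."""
    rng = random.Random(seed)
    nn_miss = 0
    dftr_miss = [0] * n1          # dftr_miss[m - 1]: misses using m blocks
    for _ in range(trials):
        x1 = rng.gauss(0.0, 1.0)                          # class 2 sample
        ys = [rng.gauss(shift, 1.0) for _ in range(n1)]   # class 1 sample
        v = rng.gauss(0.0, 1.0)                           # truly class 2
        w = abs(v - x1)
        radii = sorted(abs(y - x1) for y in ys)           # Z_(1) <= ... <= Z_(n1)
        nn_miss += min(abs(v - y) for y in ys) < w
        for m in range(1, n1 + 1):
            dftr_miss[m - 1] += w > radii[m - 1]
    return nn_miss / trials, [count / trials for count in dftr_miss]


if __name__ == "__main__":
    nn, dftr = miss_rates_by_blocks()
    print("NN:", nn)
    print("DFTR, m = 1..n1:", dftr)
```

Under these assumptions the one-block estimate should match the NN estimate up to sampling error, and the estimates cannot increase with $m$ because the region only grows as blocks are added.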


We next consider the false alarm probability, $P_{FA}$. Now

$$F_W(w/x_1) = \begin{cases} F_1(x_1+w) - F_1(x_1-w), & w > 0 \\ 0, & \text{otherwise.} \end{cases} \tag{5.48}$$

Then

$$F_U(u/x_1) = \int_0^{\infty} dz\,[F_1(x_1+u+z) - F_1(x_1-u-z)]\, n_1\,[1 - F_1(x_1+z) + F_1(x_1-z)]^{n_1 - 1}\,[f_1(x_1+z) + f_1(x_1-z)]. \tag{5.49}$$

Integrating by parts one obtains

$$P_{FA} = \frac{1}{n_1 + 1}. \tag{5.51}$$

This is the result predicted by the DFTR theory since one block out of
the possible $n_1 + 1$ blocks is used to form region $R_2$.

Let region $R_2$ be the union of $m$ blocks where the AHE ordering
procedure is used. Let $Z$ be the $m$th smallest of the $Z_i$, $i = 1, \ldots, n_1$.
Denote this by

$$Z = \min^{(m)}_{1 \le i \le n_1} Z_i. \tag{5.52}$$

From the theory of order statistics (cf. Kendall and Stuart (1958),
p. 252) the probability density for $Z$ is given by

$$f_Z(z) = \frac{n_1!}{(m-1)!\,(n_1-m)!}\, F_{Z_1}(z)^{m-1}\,[1 - F_{Z_1}(z)]^{n_1 - m}\, f_{Z_1}(z). \tag{5.53}$$
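
One consequence of the order-statistic density 5.53 is that the coverage $F_{Z_1}(Z)$ of the $m$th smallest value follows a Beta distribution with parameters $m$ and $n_1 - m + 1$, whose mean is $m/(n_1 + 1)$. The sketch below checks that mean by simulation; the exponential distribution, the function name, and the trial count are assumptions made for illustration.

```python
import math
import random


def mean_coverage_of_mth_smallest(n, m, trials=100_000, seed=4):
    """Average of F(Z_(m)), where Z_(m) is the m-th smallest of n iid
    Exponential(1) variables and F is the exponential CDF."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        z_m = sorted(rng.expovariate(1.0) for _ in range(n))[m - 1]
        total += 1.0 - math.exp(-z_m)   # F(z) = 1 - exp(-z)
    return total / trials


if __name__ == "__main__":
    n, m = 5, 2
    print(mean_coverage_of_mth_smallest(n, m), m / (n + 1))
```

Any other continuous distribution could be substituted, provided its own CDF is used to compute the coverage.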


Substituting equation 5.53 into equation 5.35 one obtains $P_M(m)$ when $m$
blocks are used to form $R_2$ by the AHE procedure.

$$P_M(m) = \int_{-\infty}^{\infty} dx_1\, f_2(x_1) \int_0^{\infty} dz\,[1 - F_2(x_1+z) + F_2(x_1-z)]\,\frac{n_1!}{(m-1)!\,(n_1-m)!}\,[F_1(x_1+z) - F_1(x_1-z)]^{m-1}\,[1 - F_1(x_1+z) + F_1(x_1-z)]^{n_1 - m}\,[f_1(x_1+z) + f_1(x_1-z)]. \tag{5.54}$$

Similarly, $P_{FA}(m)$ for $m$ blocks is

$$P_{FA}(m) = \int_{-\infty}^{\infty} dx_1\, f_2(x_1) \int_0^{\infty} dz\,[F_1(x_1+z) - F_1(x_1-z)]\,\frac{n_1!}{(m-1)!\,(n_1-m)!}\,[F_1(x_1+z) - F_1(x_1-z)]^{m-1}\,[1 - F_1(x_1+z) + F_1(x_1-z)]^{n_1 - m}\,[f_1(x_1+z) + f_1(x_1-z)]. \tag{5.55}$$

The integral over $z$ is simply the expected value of a random variable
from a Beta distribution with parameters $m$ and $n_1 - m + 1$. Then the
$m$-block false alarm probability is equal to

$$P_{FA}(m) = \frac{m}{n_1 + 1}. \tag{5.56}$$


DFTR Procedure for $n_2 > 1$


We now use the same techniques to investigate the DFTR probability
of error when more than one class 2 observation is used. The equations
become very complex and cannot be carried through in general (i.e.,
without assuming some probability density function). This section is
included only to show some of the difficulty which is involved. An intuitive
investigation into the problem is made in the next section.

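Suppose two independent observations $X_1$ and $X_2$ are available
from class 2 and one observation $Y_1$ is available from class 1. Let $X_1$
and $X_2$ be given, with $X_1 < X_2$. Let

$$\begin{aligned} Z_i &= |Y_1 - X_i|, \qquad i = 1, 2 \\ Z &= \min(Z_1, Z_2) \\ W_i &= |V - X_i|, \qquad i = 1, 2 \\ W &= \min(W_1, W_2). \end{aligned} \tag{5.57}$$

The distribution of $Z$ given $X_1 < X_2$ is, from equation 5.18,

$$F_Z(z) = F_{Z_1}(z) + F_{Z_2}(z) - F_{Z_1 Z_2}(z, z). \tag{5.58}$$

$F_{Z_1 Z_2}(z, z)$ has a nonzero value only when $z > \frac{x_2 - x_1}{2}$. $F_{Z_1 Z_2}(z, z)$ is
equal to the $F_1$ distribution in the region of overlap of the intervals
expanding from each $x_i$. Since $X_1 < X_2$, this overlap is the interval from
$x_2 - z$ to $x_1 + z$, so that

$$F_{Z_1 Z_2}(z, z/x_1 < x_2) = \begin{cases} F_1(x_1+z) - F_1(x_2-z), & z > \frac{x_2 - x_1}{2} \\ 0, & \text{otherwise.} \end{cases} \tag{5.59}$$

Then

$$F_Z(z/x_1 < x_2) = \begin{cases} F_1(x_2+z) - F_1(x_1-z), & z > \frac{x_2 - x_1}{2} \\ F_1(x_2+z) - F_1(x_2-z) + F_1(x_1+z) - F_1(x_1-z), & 0 < z < \frac{x_2 - x_1}{2} \\ 0, & \text{otherwise.} \end{cases} \tag{5.60}$$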


The above equation could have been obtained directly from Figure 5.2.
A false alarm error is made when $W < Z$ and $V \sim F_1$. In this case

$$F_W(w/x_1 < x_2) = \begin{cases} F_1(x_2+w) - F_1(x_1-w), & w > \frac{x_2 - x_1}{2} \\ F_1(x_2+w) - F_1(x_2-w) + F_1(x_1+w) - F_1(x_1-w), & 0 < w < \frac{x_2 - x_1}{2} \\ 0, & \text{otherwise.} \end{cases} \tag{5.61}$$

Let

$$U = W - Z \tag{5.62}$$

and

$$\frac{dF_Z(z/x_1 < x_2)}{dz} = \begin{cases} f_Z^{+}, & z > m \\ f_Z^{-}, & 0 < z < m \end{cases} \qquad \frac{dF_W(w/x_1 < x_2)}{dw} = \begin{cases} f_W^{+}, & w > m \\ f_W^{-}, & 0 < w < m \end{cases}$$

where

$$m = \frac{x_2 - x_1}{2}. \tag{5.63}$$

From Figure 5.3 the distribution for $U < 0$ given $x_1 < x_2$ is

$$F_U(0/x_1 < x_2) = \int_0^{m} dz \int_0^{z} dw\, f_W^{-} f_Z^{-} + \int_m^{\infty} dz \int_0^{m} dw\, f_Z^{+} f_W^{-} + \int_m^{\infty} dz \int_m^{z} dw\, f_Z^{+} f_W^{+}. \tag{5.64}$$


These integrals are easily solved. Consider the first double integral.

$$\int_0^{m} dz \int_0^{z} dw\, f_W^{-} f_Z^{-} = \int_0^{m} dz\,[F_1(x_2+z) - F_1(x_2-z) + F_1(x_1+z) - F_1(x_1-z)]\,[f_1(x_2+z) + f_1(x_2-z) + f_1(x_1+z) + f_1(x_1-z)]. \tag{5.65}$$

Recognizing that this integral is of the form $\int u\,du$ one obtains

$$\int_0^{m} dz \int_0^{z} dw\, f_W^{-} f_Z^{-} = \frac{1}{2}\,[F_1(x_2+m) - F_1(x_2-m) + F_1(x_1+m) - F_1(x_1-m)]^2. \tag{5.66}$$

Consider the second double integral of equation 5.64. By straightforward
integration one obtains

$$\int_m^{\infty} dz \int_0^{m} dw\, f_Z^{+} f_W^{-} = [F_1(x_2+m) - F_1(x_2-m) + F_1(x_1+m) - F_1(x_1-m)]\,[1 - F_1(x_2+m) + F_1(x_1-m)]. \tag{5.67}$$


Now consider the final double integral of equation 5.64. Integrating with
respect to $w$ one obtains

$$\int_m^{\infty} dz \int_m^{z} dw\, f_Z^{+} f_W^{+} = \int_m^{\infty} dz\,[F_1(x_2+z) - F_1(x_1-z) - F_1(x_2+m) + F_1(x_1-m)]\,[f_1(x_2+z) + f_1(x_1-z)]. \tag{5.68}$$

Let

$$u = F_1(x_2+z) - F_1(x_1-z), \qquad du = [f_1(x_2+z) + f_1(x_1-z)]\,dz.$$

Then the integral becomes

$$\int_m^{\infty} dz \int_m^{z} dw\, f_Z^{+} f_W^{+} = \int_{F_1(x_2+m) - F_1(x_1-m)}^{1} u\,du - [F_1(x_2+m) - F_1(x_1-m)]\,[1 - F_1(x_2+m) + F_1(x_1-m)]. \tag{5.69}$$

After integrating and collecting terms, one obtains

$$\int_m^{\infty} dz \int_m^{z} dw\, f_Z^{+} f_W^{+} = \frac{1}{2}\,[1 - F_1(x_2+m) + F_1(x_1-m)]^2. \tag{5.70}$$

Let

$$F_X = F_1(x_2+m) - F_1(x_1-m), \qquad F_Y = F_1(x_1+m) - F_1(x_2-m).$$

Then

$$F_U(0/x_1 < x_2) = \frac{1}{2}\left\{(F_X + F_Y)^2 + (1 - F_X)^2 + 2(F_X + F_Y)(1 - F_X)\right\} = \frac{1}{2}(F_Y + 1)^2. \tag{5.71}$$

Note that

$$x_1 + m = x_1 + \frac{x_2 - x_1}{2} = \frac{x_1 + x_2}{2}$$

and

$$x_2 - m = x_2 - \frac{x_2 - x_1}{2} = \frac{x_1 + x_2}{2}.$$

Therefore

$$F_Y = F_1\!\left(\frac{x_1 + x_2}{2}\right) - F_1\!\left(\frac{x_1 + x_2}{2}\right) = 0. \tag{5.72}$$

Then

$$F_U(0/x_1 < x_2) = \frac{1}{2} \tag{5.73}$$

and

$$P_{FA} = 2 \int_{-\infty}^{\infty} dx_2\, f_2(x_2) \int_{-\infty}^{x_2} dx_1\, f_2(x_1)\,\frac{1}{2} = \frac{1}{2}. \tag{5.74}$$

This is, of course, the answer which was expected from a consideration
of DFTR theory. This answer would be expected regardless of the
number of class 2 observations used to design the classifier.
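
The claim that this false alarm probability does not depend on the number of class 2 observations can be illustrated by simulation. The sketch below generalizes definition 5.57 to $n_2$ class 2 observations; the choice of distributions, the function name, and the trial count are assumptions made for the example.

```python
import random


def false_alarm_one_class1(n2, trials=200_000, seed=5):
    """Estimate Pr[W < Z] with V ~ F1, one class 1 observation Y1 and
    n2 class 2 observations, where Z = min_i |Y1 - X_i| and
    W = min_i |V - X_i|."""
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n2)]   # class 2 sample
        y1 = rng.expovariate(1.0)                       # class 1 sample
        v = rng.expovariate(1.0)                        # truly class 1
        z = min(abs(y1 - x) for x in xs)
        w = min(abs(v - x) for x in xs)
        false_alarms += w < z
    return false_alarms / trials


if __name__ == "__main__":
    for n2 in (1, 2, 4):
        print(n2, false_alarm_one_class1(n2))
```

Given the class 2 sample, $W$ and $Z$ are the same function of two independent draws from $F_1$, so each estimate should be close to 1/2.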

We now wish to obtain an expression for the miss probability for
one class 1 observation and two class 2 observations. $F_W(w)$ is now
given by

$$F_W(w/x_1 < x_2) = \begin{cases} F_2(x_2+w) - F_2(x_1-w), & w > \frac{x_2 - x_1}{2} \\ F_2(x_2+w) - F_2(x_2-w) + F_2(x_1+w) - F_2(x_1-w), & 0 < w < \frac{x_2 - x_1}{2} \\ 0, & \text{otherwise.} \end{cases} \tag{5.75}$$

$F_U(0/x_1 < x_2)$ is given by equation 5.64. The first double integral in that
equation is equal to

$$\int_0^{m} dz \int_0^{z} dw\, f_W^{-} f_Z^{-} = \int_0^{m} dz\,[F_2(x_2+z) - F_2(x_2-z) + F_2(x_1+z) - F_2(x_1-z)]\,[f_1(x_2+z) + f_1(x_2-z) + f_1(x_1+z) + f_1(x_1-z)]. \tag{5.76}$$

The second double integral is equal to

$$\int_m^{\infty} dz \int_0^{m} dw\, f_W^{-} f_Z^{+} = [F_2(x_2+m) - F_2(x_2-m) + F_2(x_1+m) - F_2(x_1-m)]\,[1 - F_1(x_2+m) + F_1(x_1-m)]. \tag{5.77}$$

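The last double integral is equal to

$$\begin{aligned} \int_m^{\infty} dz \int_m^{z} dw\, f_W^{+} f_Z^{+} &= \int_m^{\infty} dz\,[f_1(x_2+z) + f_1(x_1-z)]\,[F_2(x_2+z) - F_2(x_1-z) - F_2(x_2+m) + F_2(x_1-m)] \\ &= -[F_2(x_2+m) - F_2(x_1-m)]\,[1 - F_1(x_2+m) + F_1(x_1-m)] \\ &\quad + \int_m^{\infty} dz\,[f_1(x_2+z) + f_1(x_1-z)]\,[F_2(x_2+z) - F_2(x_1-z)]. \end{aligned} \tag{5.78}$$

Combining terms and substituting $m = \frac{x_2 - x_1}{2}$, one obtains

$$\begin{aligned} F_U(0/x_1 < x_2) &= \int_0^{\frac{x_2 - x_1}{2}} dz\,[F_2(x_2+z) - F_2(x_2-z) + F_2(x_1+z) - F_2(x_1-z)]\,[f_1(x_2+z) + f_1(x_2-z) + f_1(x_1+z) + f_1(x_1-z)] \\ &\quad + \int_{\frac{x_2 - x_1}{2}}^{\infty} dz\,[F_2(x_2+z) - F_2(x_1-z)]\,[f_1(x_2+z) + f_1(x_1-z)], \end{aligned}$$

that is,

$$\begin{aligned} F_U(0/x_1 < x_2) &= \int_0^{\infty} dz\,[F_2(x_2+z) - F_2(x_1-z)]\,[f_1(x_2+z) + f_1(x_1-z)] \\ &\quad + \int_0^{\frac{x_2 - x_1}{2}} dz\,[F_2(x_2+z) - F_2(x_1-z)]\,[f_1(x_1+z) + f_1(x_2-z)] \\ &\quad + \int_0^{\frac{x_2 - x_1}{2}} dz\,[F_2(x_1+z) - F_2(x_2-z)]\,[f_1(x_2+z) + f_1(x_1-z)] \\ &\quad + \int_0^{\frac{x_2 - x_1}{2}} dz\,[F_2(x_1+z) - F_2(x_2-z)]\,[f_1(x_1+z) + f_1(x_2-z)]. \end{aligned} \tag{5.79}$$

$P_M$ is given by

$$P_M = 2 \int_{-\infty}^{\infty} dx_2\, f_2(x_2) \int_{-\infty}^{x_2} dx_1\, f_2(x_1)\,[1 - F_U(0/x_1 < x_2)]. \tag{5.80}$$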


This equation has not been reduced to a form that permits a meaningful
comparison between $P_M^{DFTR}$ and $P_M^{NN}$ without some assumption on the
probability distributions. Hence an intuitive comparison is made below
of the miss probabilities for the DFTR and NN procedure when $n_2$
class 2 observations and $n_1$ class 1 observations are available.

5.4. Intuitive Investigation


Intuitive justification of these results can be easily produced.
For example, consider again the case when one observation is available
from each class. Let a new observation be denoted by $V$. The NN rule
compares $|V - X_1|$ with $|V - Y_1|$. The DFTR procedure compares
$|X_1 - Y_1|$ with $|X_1 - V|$. Let $V \sim F_2$ and replace $V$ by $X$. Then

$$P_M^{DFTR} = \Pr[(|X_1 - Y_1| - |X - X_1|) < 0]$$

and

$$P_M^{NN} = \Pr[(|X - Y_1| - |X - X_1|) < 0]. \tag{5.81}$$

Note that the random variable $|X_1 - X| - |X_1 - Y_1|$ conditioned on $X_1$
has the same statistics as the random variable $|X - X_1| - |X - Y_1|$
conditioned on $X$. Since $X_1$ and $X$ are independent identically
distributed random variables, $P_M$ for the DFTR rule is equal to $P_M$ for
the nearest-neighbor rule, a result which was obtained formally in the
first part of this chapter. Now let $V \sim F_1$ and replace $V$ by $Y$. Then
the false alarm probability for the DFTR procedure is

$$P_{FA}^{DFTR} = \Pr[(|X_1 - Y_1| - |X_1 - Y|) > 0] = \frac{1}{2}, \tag{5.82}$$

another result previously obtained.

Now consider a problem for which no results were obtained.
The problem is to compare $P_M$ for the DFTR rule and $P_M$ for the
nearest-neighbor rule when more than one observation is available from
class 2. Specifically, suppose two observations are available from
class 2 and one observation is available from class 1. Let $V \sim F_2$ and
replace $V$ by $X$. Then

$$P_M^{DFTR} = \Pr[\min(|X_1 - Y_1|, |X_2 - Y_1|) - \min(|X - X_1|, |X - X_2|) < 0] \tag{5.83}$$

and

$$P_M^{NN} = \Pr[|X - Y_1| - \min(|X - X_1|, |X - X_2|) < 0]. \tag{5.84}$$

The term immediately preceding the inequality sign is the same
in both equations. Note that

$$\Pr[\min(|X_1 - Y_1|, |X_2 - Y_1|) < 0] \ge \Pr[|X - Y_1| < 0].$$

The dependency in the terms $\min(|X_1 - Y_1|, |X_2 - Y_1|)$ and
$\min(|X - X_1|, |X - X_2|)$ in the first equation is through $X_1$ and $X_2$.
The dependency in the terms $|X - Y_1|$ and $\min(|X - X_1|, |X - X_2|)$
of the second equation is through $X$. Since $X$, $X_1$, and $X_2$ are
independent identically distributed random variables, the conclusion is
made that

$$P_M^{NN} \le P_M^{DFTR}.$$


For the general case of $n_1$ observations from class 1 and $n_2$
observations from class 2, $P_M$ for the DFTR procedure is

$$P_M^{DFTR} = \Pr\Big[\min_{\substack{1 \le i \le n_2 \\ 1 \le j \le n_1}} |X_i - Y_j| - \min_{1 \le i \le n_2} |X - X_i| < 0\Big] \tag{5.85}$$

and $P_M$ for the nearest-neighbor rule is

$$P_M^{NN} = \Pr\Big[\min_{1 \le j \le n_1} |X - Y_j| - \min_{1 \le i \le n_2} |X - X_i| < 0\Big]. \tag{5.86}$$

Therefore, the general conclusion is that for $n_2 > 1$

$$P_M^{NN} \le P_M^{DFTR}. \tag{5.87}$$


This is because there are $(n_2 - 1)\,n_1$ more random variables of the same
distribution from which to find a minimum which is less than zero in
the DFTR case than in the NN case. This further leads one to expect
a larger difference in $P_M^{NN}$ and $P_M^{DFTR}$ as the number of class 2
observations is increased.
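
The comparison in equations 5.85 and 5.86 can be explored by simulation. The sketch below is a rough numerical aid, not a proof: the normal class distributions, the `shift` parameter, the function name, and the trial count are all assumptions introduced for the example.

```python
import random


def miss_probabilities(n1, n2, trials=100_000, shift=1.5, seed=6):
    """Estimate P_M for the NN rule (5.86) and for the one-block DFTR
    rule (5.85), with class 2 ~ N(0, 1) and class 1 ~ N(shift, 1)."""
    rng = random.Random(seed)
    nn_miss = 0
    dftr_miss = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n2)]     # class 2 sample
        ys = [rng.gauss(shift, 1.0) for _ in range(n1)]   # class 1 sample
        x = rng.gauss(0.0, 1.0)                           # truly class 2
        nearest_class2 = min(abs(x - xi) for xi in xs)
        # (5.85): smallest distance between any class 2 / class 1 pair.
        radius = min(abs(xi - yj) for xi in xs for yj in ys)
        dftr_miss += radius - nearest_class2 < 0
        # (5.86): distance from X to its nearest class 1 observation.
        nn_miss += min(abs(x - yj) for yj in ys) - nearest_class2 < 0
    return nn_miss / trials, dftr_miss / trials


if __name__ == "__main__":
    for n2 in (1, 2, 4):
        print(n2, miss_probabilities(n1=2, n2=n2))
```

With `n2=1` the two estimates should agree up to sampling error, as shown formally earlier in the chapter. Setting `shift=0.0` with `n1=1, n2=2` gives a case that can be worked by symmetry: the NN miss probability is 1/3 and the DFTR miss probability is 1/2.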


Note that the false alarm probability for the DFTR rule can be
smaller than the false alarm probability for the NN rule. For example,
suppose two observations are available from each class. Then

$$P_{FA}^{DFTR} = 0.3333. \tag{5.88}$$

If the class probability densities are univariate normals with equal
variances and with the distance between means equal to the variances,
Fix and Hodges (1952) obtained

$$P_{FA}^{NN} = 0.4086. \tag{5.89}$$


Chapter 6


CONCLUSIONS AND SUGGESTIONS FOR FURTHER WORK

A  study  has  been  made  of  the  application  of  distribution-free 
tolerance  regions  to  pattern  recognition.  Some  procedures  have  been 
presented  here  for  designing  a  pattern  verification  system  with  a  given 
confidence  that  the  false  alarm  probability  will  be  less  than  a  desired 
quantity.  These  procedures  maximize  the  number  of  main  class  training 
observations  which  are  correctly  classified.  In  addition,  a  method  has 
been given for obtaining a measure of the miss probability. The procedures
have  been  successfully  applied  to  a  speaker  verification  problem. 

The advantages of the hypersphere DFTR classification procedure
are the following: (1) The hypersphere DFTR classification procedure
gives information about how well the classifier is expected to perform.
This is done without any knowledge of the class probability distributions
and with only one sample of independent observations from each class.
(2) The procedure is able to form very complicated, unconnected decision
regions. Hence it is useful when multimodal class probability
distributions are involved. (3) The hypersphere DFTR procedure is very easily
programmed on a digital computer. (4) The procedure is independent
of the dimensionality of the measurement space. (5) It offers automatic
reduction of the data which must be stored in the computer.

It should, however, be noted that this is a distribution-free
procedure and hence can be expected to be quite inefficient when compared to
a procedure based upon a priori knowledge of the class probability
distributions. The hypersphere DFTR procedure is most applicable to the
situation in which (1) nothing is known about the probability distributions
and (2) information about the expected probability of error is desired
without using a test sample. The only requirements for using the
hypersphere DFTR approach are (1) a properly labeled sample of independent
observations must be available from each class and (2) the class
probability distributions must be stationary.

The hypersphere DFTR procedures were applied in an automatic
speaker verification experiment. Good results were obtained by the use
of many short-term spectra of the word "my". Error rates as low as
1.92% were obtained when a 256-dimensional measurement space was used.
In a 48-dimensional measurement space, error rates as low as 4.48%
were obtained.

Three different ordering procedures were developed and are
described in detail in Chapter 3. They were tested on the speaker
verification data. Of the three methods, the CHS procedure gave lower error
rates on the average than the OHC procedure and the OHC procedure gave
lower error rates on the average than the AHE procedure.

A comparison was made of the false alarm rate which was obtained
in the speaker verification tests with the 95% upper tolerance level on the
false alarm probability which was predicted with the DFTR approach.
All test false alarm rates fell below the 95% upper tolerance limit. The
average test false alarm rate for the 21 different cases studied here was
approximately equal to 0.8 of the average expected false alarm probability
predicted from the DFTR approach.

A comparison was made of the test miss rate with a measure of
the miss probability that was obtained by using a tolerance regions
approach. All test miss rates fell below the 95% upper tolerance limit.
For the 21 different cases studied, the average miss rate was equal to
0.94 of the average expected miss rate for the AHE procedure. The
average miss rate was equal to 0.84 of the average expected miss rate
for the OHC-R procedure. The average miss rate was equal to
0.68 of the average expected miss rate for the CHS-R procedure.

Suggestions for Further Work

There are many mathematical questions which were not resolved
in Chapter 5. For example, no general expression was obtained for the
$m$ block miss probability $P_M(m)$ for the AHE procedure for $n_2$ class 2
observations and $n_1$ class 1 observations. If this expression, and
similar ones for the OHC and CHS procedures, could be obtained, $P_M(m)$
for the AHE, OHC, and CHS procedures could be compared. This would
probably require assumption of some underlying probability density
functions. One would also like to compare the miss probability for the
$m$ block AHE procedure with the miss probability for the nearest-neighbor
rule. Specific questions for which answers are needed are

(1) Is the total probability of error, $\xi_1 P_{FA} + \xi_2 P_M$, for the DFTR rule
ever less than the total probability of error for the nearest-neighbor rule
when the a priori probabilities are $\xi_1 = \frac{n_1}{n_1 + n_2}$ and $\xi_2 = \frac{n_2}{n_1 + n_2}$?

It appears that the answer to this question may be no. This follows from
the fact that the NN procedure uses information about all observations
from both classes whereas the DFTR procedure uses information about
all observations from class 2 but only information about those observations
from class 1 which are used in forming the $m$ blocks of region $R_2$.

The second question for which one would like an answer is: What
is the value of $m$ to guarantee that the miss probability for the DFTR
procedure is less than the miss probability for the nearest-neighbor rule?
The answer for $n_1$ class 1 observations and 1 class 2 observation has
been obtained. For this case $m = 2$. However, no result has been
obtained for $n_2 > 1$. Consider an example where two observations are
available from each class. Then the nearest-neighbor miss probability
is

$$P_M^{NN} = \Pr[\min(|X - X_1|, |X - X_2|) > \min(|X - Y_1|, |X - Y_2|)] \tag{6.1}$$

and the one-block DFTR miss probability is

$$P_M^{DFTR}(1) = \Pr[\min(|X - X_1|, |X - X_2|) > \min(|X_1 - Y_1|, |X_1 - Y_2|, |X_2 - Y_1|, |X_2 - Y_2|)] \tag{6.2}$$

and the two-block DFTR miss probability is

$$P_M^{DFTR}(2) = \Pr[\min(|X - X_1|, |X - X_2|) > \max\{\min(|X_1 - Y_1|, |X_2 - Y_1|),\ \min(|X_1 - Y_2|, |X_2 - Y_2|)\}]. \tag{6.3}$$

It was previously concluded in Chapter 5 that $P_M^{NN} \le P_M^{DFTR}(1)$. Now
compare the NN miss probability with the two-block AHE-DFTR miss probability.
Suppose the terms on the right side of the inequalities of equations 6.1
and 6.3 are compared. It is easily seen that

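$$\Pr[\min(|X - Y_1|, |X - Y_2|) < w] = \int_{-\infty}^{\infty} f_2(x)\,dx\,\left\{1 - [1 - F_1(x+w) + F_1(x-w)]^2\right\} \tag{6.4}$$

and

$$\Pr[\min(|X_1 - Y_1|, |X_2 - Y_1|) < w] = \int_{-\infty}^{\infty} f_1(y)\,dy\,\left\{1 - [1 - F_2(y+w) + F_2(y-w)]^2\right\}. \tag{6.5}$$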

These two terms cannot be compared without some knowledge of the
probability distributions. Suppose both distributions are normal with
equal variances. Then one would expect little difference in (6.4) and
(6.5). Likewise one would expect little difference in
$\Pr[\min(|X - Y_1|, |X - Y_2|) < w]$ and $\Pr[\min(|X_1 - Y_2|, |X_2 - Y_2|) < w]$.
Since equation 6.3 involves $\max\{\min(|X_1 - Y_1|, |X_2 - Y_1|),\ \min(|X_1 - Y_2|, |X_2 - Y_2|)\}$,
one expects that $P_M^{DFTR}(2) \le P_M^{NN}$ for two normal
distributions with equal variances.
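
The three probabilities in equations 6.1 through 6.3 can be estimated by simulation for a particular pair of distributions. The sketch below assumes normal classes with unit variance and a hypothetical separation `shift` between the means; the function name and trial count are likewise illustrative, and the estimates it produces are not results from the report.

```python
import random


def two_per_class_miss_rates(trials=100_000, shift=1.5, seed=7):
    """Estimate (6.1), (6.2) and (6.3) with two observations per class,
    class 2 ~ N(0, 1) and class 1 ~ N(shift, 1)."""
    rng = random.Random(seed)
    nn = one_block = two_block = 0
    for _ in range(trials):
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)       # class 2
        y1, y2 = rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)   # class 1
        x = rng.gauss(0.0, 1.0)                                 # truly class 2
        w = min(abs(x - x1), abs(x - x2))
        r1 = min(abs(x1 - y1), abs(x2 - y1))   # Y1 to nearest class 2 point
        r2 = min(abs(x1 - y2), abs(x2 - y2))   # Y2 to nearest class 2 point
        nn += w > min(abs(x - y1), abs(x - y2))   # equation 6.1
        one_block += w > min(r1, r2)              # equation 6.2
        two_block += w > max(r1, r2)              # equation 6.3
    return nn / trials, one_block / trials, two_block / trials


if __name__ == "__main__":
    print(two_per_class_miss_rates())
```

The two-block estimate can never exceed the one-block estimate, since the threshold in equation 6.3 is at least as large as the one in equation 6.2 on every trial. Whether it falls below the NN estimate depends on the assumed distributions and is what the text conjectures for normal classes with equal variances.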

Many of the ordering procedures presented in Chapter 3 were not
tested experimentally. For example, the ordering procedure which gives
both the expected false alarm probability and the expected miss probability
(section 3.10) was not tested. The performance of this procedure would
be of considerable interest. In addition, an experimental comparison of
the hypersphere DFTR procedures with classification procedures which
were not tested here would be desirable.




Appendix  A 


THEORY  OF  DISTRIBUTION-FREE  TOLERANCE  REGIONS 

Summary 

This section contains a discussion of the theory of distribution-free
tolerance regions (DFTR), especially that theory which can be
applied to pattern recognition. At first, the discussion is limited to one
dimensional, continuous probability distribution functions. The object
is to make the statement that with probability $\gamma$ ($0 < \gamma < 1$) at least
$100\beta\%$ ($0 < \beta < 1$) of an unknown probability distribution is contained in
the interval between certain order statistics.

Next, techniques are considered for the construction of regions
in $D$ dimensional space in which at least $100\beta\%$ of an unknown $D$
dimensional distribution is contained with probability $\gamma$. Results for
discontinuous distributions are then discussed.

A.1. One Dimensional Theory

Suppose $X$ is a random variable with a continuous distribution
function $F(x)$. The probability that $X$ is less than or equal to $x$ is
denoted by

$$\Pr(X \le x) = \int_{-\infty}^{x} f(y)\,dy = F(x). \tag{A-1}$$

Let the differential form of the above equation,

$$\Pr(x - \delta x < X \le x) = dF(x), \tag{A-2}$$

be called the probability element of $X$.

Suppose $(X_1, X_2, \ldots, X_n)$ is a sample of $n$ statistically
independent observations from a population with continuous distribution $F(x)$.
Population is used here in the usual statistical sense to mean the totality
of possible outcomes of an experiment. The probability element of the
sample is

$$\Pr(x_1 - \delta x_1 < X_1 \le x_1, \ldots, x_n - \delta x_n < X_n \le x_n) = \prod_{i=1}^{n} dF(x_i). \tag{A-3}$$

Let $Y_i = F(X_i)$, $i = 1, \ldots, n$. The probability element of the $Y_i$ is

$$\Pr(y_1 - \delta y_1 < Y_1 \le y_1, \ldots, y_n - \delta y_n < Y_n \le y_n) = \begin{cases} \prod_{i=1}^{n} dy_i, & 0 \le y_i \le 1 \\ 0, & \text{otherwise.} \end{cases}$$

Suppose the observations $X_1, X_2, \ldots, X_n$ are arranged in order
of increasing magnitude. The ordered observations, $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$,
where $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$, are called order statistics of the
sample. The intervals $(-\infty, X_{(1)}]$, $(X_{(1)}, X_{(2)}]$, $\ldots$, $(X_{(n)}, \infty)$ are called
sample blocks. The random variables $F(X_{(1)})$, $F(X_{(2)}) - F(X_{(1)})$, $\ldots$,
$1 - F(X_{(n)})$ are called coverages of these blocks. Notice that the coverage
of a given block is the amount of probability from the distribution function
$F(x)$ in that block. Since $F(x)$ is assumed continuous, $\Pr\{X_{(i-1)} = X_{(i)}\}
= 0$, $i = 2, \ldots, n$. Therefore the $\le$ sign between the order statistics
can be replaced by the $<$ sign.
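
The notion of block coverage can be made concrete with a short simulation. The sketch below draws samples from a standard normal population (an assumption made for the example, as are the function name and trial count) and averages the coverage of each of the $n + 1$ sample blocks. A standard property of these coverages, not stated at this point in the text, is that each has expected value $1/(n + 1)$ whatever the continuous distribution $F$.

```python
import math
import random


def normal_cdf(x):
    """Standard normal distribution function F(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_block_coverages(n, trials=50_000, seed=8):
    """Average coverage of each of the n + 1 sample blocks formed by the
    order statistics of n standard normal observations."""
    rng = random.Random(seed)
    sums = [0.0] * (n + 1)
    for _ in range(trials):
        xs = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        # F at the block boundaries: -infinity, X_(1), ..., X_(n), +infinity.
        edges = [0.0] + [normal_cdf(x) for x in xs] + [1.0]
        for k in range(n + 1):
            sums[k] += edges[k + 1] - edges[k]
    return [s / trials for s in sums]


if __name__ == "__main__":
    print(mean_block_coverages(4))
```

For `n = 4` the five averages should each be close to 0.2, and they always sum to one because the blocks partition the real line.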

The probability element of the ordered random variables
$Y_{(i)} = F(X_{(i)})$, $i = 1, 2, \ldots, n$ is

$$\begin{cases} K\,dy_{(1)}\,dy_{(2)} \cdots dy_{(n)}, & 0 < y_{(1)} < y_{(2)} < \cdots < y_{(n)} < 1 \\ 0, & \text{otherwise.} \end{cases} \tag{A-4}$$

To find the value of the constant $K$, equation (A-4) is integrated over its
region of definition. Since $F(x)$ is a continuous nondecreasing function
of $x$, the $y_{(i)}$'s and the $x_{(i)}$'s have the same order. Therefore

$$\int_0^{1} \int_0^{y_{(n)}} \cdots \int_0^{y_{(2)}} K\,dy_{(1)}\,dy_{(2)} \cdots dy_{(n)} = 1 \tag{A-5}$$

and $K = n!$.

Eventually the statement,

$$\Pr\{[F(X_{(s)}) - F(X_{(r)})] \ge \beta\} = \gamma, \tag{A-6}$$

is made where $r$ and $s$ are positive integers with $r < s \le n$.
Therefore the marginal, joint distribution of $F(X_{(r)})$ and $F(X_{(s)})$ must
be found. The probability element of $Y_{(r)} = F(X_{(r)})$ and $Y_{(s)} = F(X_{(s)})$
is obtained by integrating equation (A-5) with respect to $y_{(1)}, \ldots, y_{(r-1)}$
over the region $0 < y_{(1)} < \cdots < y_{(r)}$, with respect to $y_{(r+1)}, \ldots, y_{(s-1)}$
over the region $y_{(r)} < \cdots < y_{(s)}$, and with respect to $y_{(s+1)}, \ldots, y_{(n)}$
over the region $y_{(s)} < \cdots < y_{(n)}$.

$$\begin{aligned} &\Pr\{y_{(r)} - \delta y_{(r)} < Y_{(r)} \le y_{(r)},\ y_{(s)} - \delta y_{(s)} < Y_{(s)} \le y_{(s)}\} \\ &\quad = n!\,dy_{(r)}\,dy_{(s)} \int_{y_{(s)}}^{1} \cdots \int_{y_{(s)}}^{y_{(s+2)}} dy_{(s+1)} \cdots dy_{(n)} \int_{y_{(r)}}^{y_{(s)}} \cdots \int_{y_{(r)}}^{y_{(r+2)}} dy_{(r+1)} \cdots dy_{(s-1)} \int_0^{y_{(r)}} \cdots \int_0^{y_{(2)}} dy_{(1)} \cdots dy_{(r-1)} \\ &\quad = \frac{n!}{(r-1)!\,(s-r-1)!\,(n-s)!}\, y_{(r)}^{\,r-1}\,[y_{(s)} - y_{(r)}]^{s-r-1}\,[1 - y_{(s)}]^{n-s}\,dy_{(r)}\,dy_{(s)}, \qquad 0 < y_{(r)} < y_{(s)} < 1. \end{aligned} \tag{A-7}$$


The probability element of the random variable $Y_{(s)} - Y_{(r)}$ is
desired. Therefore let

\[ W = Y_{(s)} - Y_{(r)} \]

and

\[ Z = Y_{(r)} . \]

The Jacobian of the transformation is one. Hence, the probability element
of W and Z is

\[
\frac{n!}{(r-1)! \, (s-r-1)! \, (n-s)!} \,
z^{r-1} \, w^{s-r-1} \, [1 - w - z]^{n-s} \, dz \, dw,
\qquad 0 < z < w + z < 1 .
\]

Integrating z over the range [0, 1-w], the marginal distribution of
$W = F(X_{(s)}) - F(X_{(r)})$ is obtained,

\[
\Pr(w - \delta w < W \le w) =
\frac{n! \, w^{s-r-1} \, dw}{(r-1)! \, (s-r-1)! \, (n-s)!}
\int_0^{1-w} z^{r-1} (1 - w - z)^{n-s} \, dz .
\tag{A-8}
\]


Letting z = (1-w)t, this equation reduces to

\[
\Pr(w - \delta w < W \le w) =
\frac{n! \, w^{s-r-1} \, dw}{(r-1)! \, (s-r-1)! \, (n-s)!} \,
(1-w) \, (1-w)^{r-1} \, (1-w)^{n-s}
\int_0^1 (1-t)^{n-s} \, t^{r-1} \, dt .
\]

But

\[
\int_0^1 (1-t)^{n-s} \, t^{r-1} \, dt = \frac{(n-s)! \, (r-1)!}{(n-s+r)!} .
\]

Therefore

\[
\Pr(w - \delta w < W \le w) =
\frac{n!}{(s-r-1)! \, (n-s+r)!} \, w^{s-r-1} \, (1-w)^{n-s+r} \, dw,
\qquad 0 < w < 1 .
\tag{A-9}
\]

Equation (A-9) is recognized as the Beta distribution.
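The Beta form of equation (A-9) can be checked numerically. The following is a minimal sketch rather than part of the original derivation: it assumes exponentially distributed data (an arbitrary choice, so that F is known in closed form), and the function names are hypothetical. The empirical mean and variance of W should agree with those of a Beta density with parameters s-r and n-s+r+1.

```python
import math
import random


def coverage_samples(n, r, s, trials=20000, seed=0):
    """Simulate W = F(X_(s)) - F(X_(r)) for samples of size n.

    Assumption: the data are exponential with rate 1, so
    F(x) = 1 - exp(-x). Any continuous F should give the same law for W.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        xs = sorted(rng.expovariate(1.0) for _ in range(n))
        f_r = 1.0 - math.exp(-xs[r - 1])  # F at the r-th order statistic
        f_s = 1.0 - math.exp(-xs[s - 1])  # F at the s-th order statistic
        out.append(f_s - f_r)
    return out


def beta_mean_var(p, q):
    """Mean and variance of a Beta(p, q) random variable."""
    mean = p / (p + q)
    var = p * q / ((p + q) ** 2 * (p + q + 1))
    return mean, var


if __name__ == "__main__":
    n, r, s = 20, 3, 15
    ws = coverage_samples(n, r, s)
    emp_mean = sum(ws) / len(ws)
    emp_var = sum((w - emp_mean) ** 2 for w in ws) / len(ws)
    print("empirical:", emp_mean, emp_var)
    print("Beta     :", beta_mean_var(s - r, n - s + r + 1))
```

Replacing the exponential generator by any other continuous distribution (together with its own F) should leave the two printed lines in agreement, which is the distribution-free property.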

The probability that $F(X_{(s)}) - F(X_{(r)})$ is greater than or equal
to $\beta$ is given by

\[
\Pr\{ [F(X_{(s)}) - F(X_{(r)})] \ge \beta \}
= \int_\beta^1 \frac{n!}{(s-r-1)! \, (n-s+r)!} \, w^{s-r-1} \, (1-w)^{n-s+r} \, dw
= 1 - I_\beta(s-r,\; n-s+r+1) .
\tag{A-10}
\]


The function $I_\beta(p,q)$ is called the Incomplete Beta Function and its
values are tabulated in the literature [Pearson (1934)]. Notice that s-r
appears symmetrically in the above equation. Let m = n+1-(s-r) be the
number of intervals which are excluded from the region in which we
want to contain at least $\beta$ of the population. Then
$\Pr\{ [F(X_{(s)}) - F(X_{(r)})] \ge \beta \} = \gamma$ is given by

\[
\gamma = \int_\beta^1 \frac{n!}{(n-m)! \, (m-1)!} \, w^{n-m} \, (1-w)^{m-1} \, dw
= 1 - I_\beta(n+1-m,\; m) .
\tag{A-11}
\]
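Equation (A-11) can be evaluated without Beta tables by using the standard identity between the incomplete Beta function and the binomial tail, $1 - I_\beta(n+1-m, m) = \sum_{k=m}^{n} \binom{n}{k} (1-\beta)^k \beta^{n-k}$. The sketch below is an illustration under that identity; the function names `confidence` and `smallest_n` are hypothetical and are not taken from the report.

```python
from math import comb


def confidence(n, m, beta):
    """gamma = Pr{coverage >= beta} when m of the n+1 blocks are excluded.

    Evaluates 1 - I_beta(n+1-m, m) as a binomial tail probability.
    """
    p = 1.0 - beta
    return sum(comb(n, k) * p ** k * beta ** (n - k) for k in range(m, n + 1))


def smallest_n(m, beta, gamma, n_max=100000):
    """Smallest sample size n for which confidence(n, m, beta) >= gamma."""
    for n in range(m, n_max + 1):
        if confidence(n, m, beta) >= gamma:
            return n
    raise ValueError("no n <= n_max reaches the requested confidence")


if __name__ == "__main__":
    # With a single excluded block (m = 1) the result reduces to 1 - beta**n.
    print(confidence(59, 1, 0.95))
    print(smallest_n(1, 0.95, 0.95))
```

For m = 1 the sum collapses to $1 - \beta^n$, which gives a quick sanity check on the implementation.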


Given three of the four variables n, m, $\beta$, $\gamma$ one can solve for the
fourth. Murphy (1948) has constructed graphs of $\beta$ versus n (n < 500)
for various m and for confidences $\gamma$ = 0.90, 0.95, 0.99. Sommerville
(1958) has tabulated m for values of n = 50 to n = 1000; $\beta$ = 0.50,
0.75, 0.90, 0.95, 0.99; and $\gamma$ = 0.50, 0.75, 0.90, 0.95, 0.99. If
these graphs and tables do not contain the desired values for n, m, $\beta$,
or $\gamma$, one can calculate n by the following approximation due to Scheffe
and Tukey (1945),

\[
n \approx \frac{1}{4} \, \chi^2_{2m;\,1-\gamma} \, \frac{1+\beta}{1-\beta} + \frac{1}{2}(m-1)
\tag{A-12}
\]

where $\chi^2_{2m;\,1-\gamma}$ is the point exceeded with probability $1-\gamma$ for the
Chi-squared distribution with 2m degrees of freedom. For $\gamma$ and $\beta$ in
the range (0.9, 1.0) the approximation error is generally less than one
tenth of one percent. If one wishes to calculate $\beta$ for a desired $\gamma$, m,
and large n the following approximation may be used.


4 


2  2  2 
(X  -  2m  +  I6n(n-m))  -  (X  -  2m) 

2m;l-y _  2m;l-y _ _ 


4n 


(A-13) 
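The Scheffe and Tukey approximation (A-12) is easy to evaluate numerically. The sketch below is illustrative only: it relies on the closed-form tail of the Chi-squared distribution with an even number 2m of degrees of freedom, $\Pr(\chi^2_{2m} > x) = e^{-x/2} \sum_{k=0}^{m-1} (x/2)^k / k!$, and finds the required percentage point by bisection. All function names are hypothetical.

```python
import math


def chi2_tail_even(x, m):
    """Pr(chi-squared with 2m degrees of freedom > x), closed form."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, m):
        term *= half / k  # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def chi2_point_even(m, tail, tol=1e-10):
    """Point exceeded with probability `tail` (2m degrees of freedom)."""
    lo, hi = 0.0, 1.0
    while chi2_tail_even(hi, m) > tail:  # bracket the solution
        hi *= 2.0
    while hi - lo > tol:  # bisection on a decreasing function
        mid = 0.5 * (lo + hi)
        if chi2_tail_even(mid, m) > tail:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def scheffe_tukey_n(m, beta, gamma):
    """Approximate sample size from equation (A-12)."""
    chi2 = chi2_point_even(m, 1.0 - gamma)
    return 0.25 * chi2 * (1.0 + beta) / (1.0 - beta) + 0.5 * (m - 1)


if __name__ == "__main__":
    # m = 1, beta = gamma = 0.95; the exact answer from (A-11) is n = 59.
    print(scheffe_tukey_n(1, 0.95, 0.95))
```

For m = 1 and $\beta = \gamma = 0.95$ the approximation gives roughly 58.4, which rounds up to the exact value of 59 obtained from $1 - \beta^n \ge \gamma$.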


A.  2.  Generalizations 

As one might expect, the sum of any m of the n+1 blocks
determined by the ordered observations gives the result (A-11). This
is easily seen by examining the distribution of the coverages. From
equation (A-4), the probability element of $Y_{(i)} = F(X_{(i)})$,
i = 1, ..., n is

\[
\begin{cases}
n! \, dy_{(1)} \, dy_{(2)} \cdots dy_{(n)}, & 0 < y_{(1)} < \cdots < y_{(n)} < 1 \\
0, & \text{otherwise.}
\end{cases}
\]


Let the coverages be denoted by $U_i$, i = 1, ..., n+1. Let $u_i$,
i = 1, ..., n+1 be the variates corresponding to the random variables
$U_i$, i = 1, ..., n+1. Let $u_1 = y_{(1)}$, $u_2 = y_{(2)} - y_{(1)}$, ...,
$u_i = y_{(i)} - y_{(i-1)}$, ..., $u_n = y_{(n)} - y_{(n-1)}$ and
$u_{n+1} = 1 - u_1 - \cdots - u_n$. Since the magnitude of the Jacobian of
the transformation from $y_{(1)}, \ldots, y_{(n)}$ to $u_1, \ldots, u_n$ is
one, the probability element of the coverages is

\[
\begin{cases}
n! \, du_1 \, du_2 \cdots du_n, & 0 < u_i,\; i = 1, \ldots, n+1;\; \sum_{i=1}^{n+1} u_i = 1 \\
0, & \text{otherwise.}
\end{cases}
\tag{A-14}
\]


Equation (A-14) is completely symmetrical with respect to the $u_i$.
This means that coverage of one block has the same properties as the
coverage of any other block. Because of the symmetry, the distribution
of the sum of any t coverages is the same as the distribution of the
sum of the first t coverages. The sum of the first t coverages is
$\sum_{i=1}^{t} U_i = F(X_{(t)}) = Y_{(t)}$. Therefore, the probability
element of the sum of any t coverages is given by

\[
n! \, dy_{(t)}
\int_0^{y_{(t)}} \cdots \int_0^{y_{(2)}} dy_{(1)} \cdots dy_{(t-1)}
\int_{y_{(t)}}^{1} \cdots \int_{y_{(t)}}^{y_{(t+2)}} dy_{(t+1)} \cdots dy_{(n)}
= \frac{n!}{(t-1)! \, (n-t)!} \, y_{(t)}^{\,t-1} \, (1 - y_{(t)})^{n-t} \, dy_{(t)} .
\tag{A-15}
\]


Let m = n+1-t be the number of blocks which are eliminated from
the interval of interest, and denote $Y_{(t)}$ by w. The probability
element of the coverage of the intervals between any n+1-m order
statistics is given by

\[
\frac{n!}{(n-m)! \, (m-1)!} \, w^{n-m} \, (1-w)^{m-1} \, dw,
\qquad 0 < w < 1 .
\tag{A-16}
\]

Integrating equation (A-16) over $[\beta, 1]$ one obtains equation (A-11).

The idea of coverage is more general than it first appears.
Let the observations again be labeled $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ where
$X_{(1)} < X_{(2)} < \cdots < X_{(n)}$. The blocks need not necessarily be
defined as $(-\infty, X_{(1)}], (X_{(1)}, X_{(2)}], \ldots, (X_{(n)}, \infty)$.
For example consider the blocks
$(\xi, X_{(j)}], (X_{(j)}, X_{(j+1)}], \ldots, (X_{(n-1)}, X_{(n)}],
\{(X_{(n)}, \infty) \cup (-\infty, X_{(1)}]\}, (X_{(1)}, X_{(2)}], \ldots,
(X_{(j-2)}, X_{(j-1)}], (X_{(j-1)}, \xi)$ where $\xi$ is some number
on the real line and j is a positive integer with $j \le n$. These blocks
are formed by ordering the observations in the following manner. See
Figure A-1. The blocks are numbered 1, 2, ... in the order
that they are formed. The first block is formed by searching for the
observation whose value is closest to $\xi$ while being greater than $\xi$.
The second block is formed by searching for the second closest observation
to $\xi$ which is also greater than $\xi$. The procedure is continued
until the largest observation $X_{(n)}$ is found. A search is made for an
observation greater than $X_{(n)}$ and none is found. To complete the
block, a search is made from minus infinity for the smallest observation.
When this observation is found, the block
$(X_{(n)}, \infty) \cup (-\infty, X_{(1)}]$ is completed. The last j-2 blocks
are formed by searching for subsequently larger observations. The
(n+1)th block is $(X_{(j-1)}, \xi)$. Let the coverages be defined as follows:

\[
U_1 = F(X_{(j)}) - F(\xi),\quad
U_2 = F(X_{(j+1)}) - F(X_{(j)}),\quad \ldots,\quad
U_{n-j+2} = 1 - F(X_{(n)}) + F(X_{(1)}),\quad \ldots,\quad
U_n = F(X_{(j-1)}) - F(X_{(j-2)}) .
\]

The Jacobian of the transformation from the random variables
$F(X_{(1)}), \ldots, F(X_{(n)})$ to the coverages $U_1, \ldots, U_n$ is one.
Hence, the probability element of the coverages is



regions. 

It should be noted that only order statistics yield distribution-
free tolerance regions. See Robbins (1944).

Some warning seems appropriate on the selection of intervals in
which $\beta$ of the distribution is to be contained with probability $\gamma$. The
ordering functions and the blocks should, in general, be prescribed
before the sample is taken. For example, suppose that one observation,
$X_1$, is taken from the population described by F(x) on the real line.
If the location of $X_1$ is not known, one can search for $X_1$ from any
point $\xi$ as before and have no reason to believe that $X_1$ will be found
before one-half the probability measure has been covered. However,
if the location of $X_1$ is known and if an ordering procedure is devised
so a "search" is made toward $X_1$ from a point which is a very small
distance to the left of $X_1$, the amount of probability in $(\xi, X_1)$ can be
made much less than the amount of probability in the other "block"
$(-\infty, \xi) \cup (X_1, \infty)$. Hence statistically equivalent blocks are not
formed in this example. The blocks can be chosen, however, on the basis of
a previous sample when the results are to be applied to a future sample.

It is seen later that subsequent ordering functions may depend on the
location of the observations that have previously been ordered. But the
ordering functions should not, in general, depend on the location of any
unordered observations.

A.  3.  D-Dimensional  Theory 

Thus far, only one-dimensional, continuous distribution functions
have been discussed. Multi-dimensional distribution-free tolerance
regions are defined as follows:

Definition: Suppose a sample of size n is drawn from a continuous,
D-dimensional distribution function $F(x_1, x_2, \ldots, x_D)$. Region R is a
D-dimensional distribution-free tolerance region if the amount of
probability from the distribution $F(x_1, x_2, \ldots, x_D)$ in region R does
not depend on $F(x_1, \ldots, x_D)$.

Suppose $X_1$ and $X_2$ are two random variables which are
described by the bivariate distribution function $F(x_1, x_2)$. Our task is
to construct distribution-free tolerance regions in space $\{x_1, x_2\}$.
Suppose a sample of n observations is available from $F(x_1, x_2)$. It is
evident from the one-dimensional study that n lines, each of which
intersects a different observation and lies parallel to the $x_1$ axis,
divide the space into statistically equivalent blocks. The same is true,
of course, for n lines passing through each observation and being
parallel to the $x_2$ axis.

A more general ordering for producing distribution-free tolerance
regions also becomes evident. Suppose the observations are ordered
with the function $h(x_1, x_2)$. Let $V = h(X_1, X_2)$. As long as the
distribution function of V, $F_V(v)$, is continuous, it is seen from the
one-dimensional derivation that the amount of probability between the
order statistics $V_{(1)}, V_{(2)}, \ldots, V_{(n)}$ is independent of $F_V(v)$.
Hence, the coverages defined by the n identical curves, each one passing
through one of the n observations, are distribution-free. These methods
for ordering the observations are easily extended for distribution
functions of more than two variates.
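Ordering by a single function h can be illustrated with a short simulation. The sketch below makes several assumptions that are not in the text: the ordering function is $h(x_1, x_2) = x_1^2 + x_2^2$ (so the curves are circles), and the data are independent standard normal pairs, chosen only because $F_V(v) = 1 - e^{-v/2}$ is then known exactly. The coverage of the disc bounded by the circle through $V_{(t)}$ should have mean t/(n+1).

```python
import math
import random


def disc_coverages(n, t, trials=20000, seed=1):
    """Coverage of the region {h <= V_(t)} for h(x1, x2) = x1^2 + x2^2.

    Assumption: independent standard normal coordinates, for which
    V = h(X1, X2) is Chi-squared with 2 degrees of freedom and
    F_V(v) = 1 - exp(-v / 2).
    """
    rng = random.Random(seed)
    covs = []
    for _ in range(trials):
        v = sorted(rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
                   for _ in range(n))
        covs.append(1.0 - math.exp(-v[t - 1] / 2.0))  # F_V at V_(t)
    return covs


if __name__ == "__main__":
    n, t = 19, 15
    covs = disc_coverages(n, t)
    print("mean coverage:", sum(covs) / len(covs), "expected:", t / (n + 1))
```

The same mean is obtained for any other continuous $F_V$, since the coverage $F_V(V_{(t)})$ follows the one-dimensional Beta result.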


A. 4.  Wald's  Ordering 

A more general ordering was proposed by Wald (1943). Let
$(X_1, X_2, \ldots, X_D)$ be a set of D random variables with continuous
probability density $f(x_1, x_2, \ldots, x_D)$. Take a sample of n independent
observations and denote the $\alpha$th observation of $X_i$ by $X_{i\alpha}$
(i = 1, ..., D; $\alpha$ = 1, ..., n). The problem is to construct D pairs of
functions, $L_i(X_{11}, \ldots, X_{Dn})$ and $M_i(X_{11}, \ldots, X_{Dn})$,
i = 1, ..., D, so that the distribution of the statistic

\[
Q = \int_{L_D}^{M_D} \cdots \int_{L_1}^{M_1} f(x_1, \ldots, x_D) \, dx_1 \cdots dx_D
\]

is independent of $f(x_1, \ldots, x_D)$. The following construction procedure
satisfies this requirement. Let the $X_{1\alpha}$ be arranged in order of
increasing magnitude, $X_{1(1)} < X_{1(2)} < \cdots < X_{1(n)}$. Choose
$L_1 = X_{1(r_1)}$ and $M_1 = X_{1(s_1)}$ where $r_1$ and $s_1$ are positive
integers with $r_1 < s_1 \le n$.

Next consider only the observations for which
$X_{1(r_1)} < X_{1j} < X_{1(s_1)}$. Arrange these observations in order of
increasing magnitude of the second coordinate,
$X'_{2(1)} < X'_{2(2)} < \cdots < X'_{2(s_1-r_1-1)}$. The prime distinguishes
this ordering of the retained observations from an ordering of the whole
sample. Choose $L_2 = X'_{2(r_2)}$ and $M_2 = X'_{2(s_2)}$, and continue in the
same way for the remaining coordinates. A bivariate example is shown in
Figure A-2a with $r_1 = 2$, $s_1 = 8$, $r_2 = 1$, $s_2 = 5$.

The key to Wald's successful ordering, as will be seen later, is
the successive elimination of blocks which have been formed. Suppose
rather than use Wald's ordering, we let $L_i = X_{i(r_i)}$ and
$M_i = X_{i(s_i)}$, i = 1, ..., D where $r_i$ and $s_i$ denote positive
integers with $r_i < s_i \le n$. Therefore for i = 2, ..., D we do not
eliminate from subsequent ordering those observations which have
previously been ordered. This ordering does not yield a region whose
coverage is independent of the distribution $F(x_1, \ldots, x_D)$ for all
$r_i < s_i \le n$ when
$F(x_1, x_2, \ldots, x_D) \ne F_1(x_1) F_2(x_2) \cdots F_D(x_D)$. For example,
the crosshatched region of Figure A-2b is not a distribution-free
tolerance region when the random variables $X_1$ and $X_2$ are
statistically dependent.

However, if the random variables $X_1, X_2, \ldots, X_D$ are statistically
independent, each variable can be ordered separately. Then the coverage
of the region defined by $X_{i(r_i)}$ and $X_{i(s_i)}$, i = 1, ..., D is given
by the product of the coverages of each variable. For example, consider a
bivariate distribution $F(x_1, x_2)$ which satisfies the relationship
$F(x_1, x_2) = F_1(x_1) F_2(x_2)$. If
$U = F_1(X_{1(s_1)}) - F_1(X_{1(r_1)})$ and
$V = F_2(X_{2(s_2)}) - F_2(X_{2(r_2)})$, the probability distribution of
$W = U \cdot V$ can be calculated by integrating

\[
\int_0^1 \int_0^{w/v}
\frac{(n!)^2}{(s_1-r_1-1)! \, (n-s_1+r_1)! \, (s_2-r_2-1)! \, (n-s_2+r_2)!} \,
u^{s_1-r_1-1} \, v^{s_2-r_2-1} \, (1-u)^{n-s_1+r_1} \, (1-v)^{n-s_2+r_2} \, du \, dv .
\tag{A-19}
\]


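Now consider Wald's proof that his ordering scheme produces
distribution-free tolerance regions. Consider the bivariate case where
the random variables are statistically dependent and the joint probability
density function is continuous. The object is to show that the distribution
of the statistic

\[
Q = \int_{L_2}^{M_2} \int_{L_1}^{M_1} f(x_1, x_2) \, dx_1 \, dx_2
\tag{A-20}
\]

where $L_1$, $M_1$, $L_2$, and $M_2$ are given by Wald's ordering, is
independent of the probability density function, $f(x_1, x_2)$.

Make the following definitions:

\[
P = \int_{-\infty}^{\infty} \int_{L_1}^{M_1} f(x_1, x_2) \, dx_1 \, dx_2
\tag{A-21}
\]

\[
\bar{P} = \frac{\displaystyle \int_{L_2}^{M_2} \int_{L_1}^{M_1} f(x_1, x_2) \, dx_1 \, dx_2}
               {\displaystyle \int_{-\infty}^{\infty} \int_{L_1}^{M_1} f(x_1, x_2) \, dx_1 \, dx_2}
\tag{A-22}
\]

P is the probability that $X_1$ lies between $L_1$ and $M_1$. $\bar{P}$ is the
probability that $X_2$ lies between $L_2$ and $M_2$ given that $X_1$ lies
between $L_1$ and $M_1$. It is evident that

\[
Q = P \bar{P} .
\tag{A-23}
\]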


If $L_1 = X_{1(r_1)}$ and $M_1 = X_{1(s_1)}$ where $r_1$ and $s_1$ are positive
integers with $r_1 < s_1 \le n$, it is clearly seen from the one-dimensional
discussion that the probability element of P is given by

\[
\frac{n!}{(s_1-r_1-1)! \, (n-s_1+r_1)!} \, P^{s_1-r_1-1} \, (1-P)^{n-s_1+r_1} \, dP .
\tag{A-24}
\]

Let $L_2 = X'_{2(r_2)}$ and $M_2 = X'_{2(s_2)}$ as defined in Wald's
construction procedure. Since $X'_{2(1)}, \ldots, X'_{2(s_1-r_1-1)}$ can be
considered as $s_1-r_1-1$ independent observations on random variable $X_2$
under the condition that $L_1 < X_1 < M_1$, the probability element of
$\bar{P}$ is given by

\[
\frac{(s_1-r_1-1)!}{(s_2-r_2-1)! \, (s_1-r_1-s_2+r_2-1)!} \,
\bar{P}^{\,s_2-r_2-1} \, (1-\bar{P})^{s_1-r_1-s_2+r_2-1} \, d\bar{P} .
\tag{A-25}
\]

Note that equation (A-25) does not involve $L_1$ and $M_1$. Hence, the
joint probability element of P and $\bar{P}$ is given by the product of
equations (A-24) and (A-25),

\[
\frac{n!}{(n-s_1+r_1)! \, (s_2-r_2-1)! \, (s_1-r_1-s_2+r_2-1)!} \,
P^{s_1-r_1-1} \, (1-P)^{n-s_1+r_1} \,
\bar{P}^{\,s_2-r_2-1} \, (1-\bar{P})^{s_1-r_1-s_2+r_2-1} \, dP \, d\bar{P} .
\tag{A-26}
\]


Then the joint probability element of P and $Q = P\bar{P}$ is given by

\[
K \, Q^{s_2-r_2-1} \, (1-P)^{n-s_1+r_1} \, (P-Q)^{s_1-r_1-1-s_2+r_2} \, dP \, dQ,
\qquad 0 < Q < P < 1,
\]

where K is the multiplicative constant of equation A-26. By integrating
P over the interval [Q, 1] we obtain

\[
K \, dQ \, Q^{s_2-r_2-1} \int_Q^1 (1-P)^{n-s_1+r_1} \, (P-Q)^{s_1-r_1-1-s_2+r_2} \, dP .
\]

Let R = P-Q. The value of the integral is then

\[
\int_Q^1 (1-P)^{n-s_1+r_1} \, (P-Q)^{s_1-r_1-1-s_2+r_2} \, dP
= \int_0^{1-Q} (1-Q-R)^{n-s_1+r_1} \, R^{s_1-r_1-1-s_2+r_2} \, dR .
\]

Let R = (1-Q)T. Then the above equation reduces to

\[
(1-Q)^{n-s_1+r_1} \, (1-Q)^{s_1-r_1-s_2+r_2}
\int_0^1 (1-T)^{n-s_1+r_1} \, T^{s_1-r_1-1-s_2+r_2} \, dT .
\]

Integrating with respect to T one finds that the probability element of
Q is

\[
\frac{n!}{(s_2-r_2-1)! \, (n-s_2+r_2)!} \, Q^{s_2-r_2-1} \, (1-Q)^{n-s_2+r_2} \, dQ .
\tag{A-27}
\]

Since equation A-27 is independent of $f(x_1, x_2)$, Wald's method of
ordering produces distribution-free tolerance regions.

It is convenient to think of Wald's ordering as eliminating blocks
from the region of interest. The 1st ordering ($X_{1(r_1)}$ and $X_{1(s_1)}$)
eliminates the $r_1$ blocks whose $X_1$ coordinate is less than $X_{1(r_1)}$
and the $n+1-s_1$ blocks whose $X_1$ coordinate is greater than $X_{1(s_1)}$.
The 2nd ordering eliminates $r_2$ blocks and $s_1-r_1-s_2$ blocks for a total
elimination of $n+1-s_2+r_2$ blocks. By substituting $m = n+1-s_2+r_2$ into
equation A-27 one obtains equation A-16, the equation for the probability
element for the one-dimensional case.
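Wald's two-step ordering and the distribution-free property of the resulting rectangle can be sketched in a few lines. This is an illustration under stated assumptions, not Wald's own presentation: the data are correlated Gaussian pairs with correlation `rho` (an arbitrary dependent distribution), the sample size n = 9 is chosen to suit the parameters of Figure A-2a, and the coverage is probed with one fresh observation per trial, so the hit rate estimates the mean of the Beta density (A-27), namely $(s_2 - r_2)/(n+1)$. Function names are hypothetical.

```python
import math
import random


def wald_rectangle(sample, r1, s1, r2, s2):
    """Return (L1, M1, L2, M2) from Wald's ordering of a bivariate sample."""
    by_x1 = sorted(sample, key=lambda p: p[0])
    l1, m1 = by_x1[r1 - 1][0], by_x1[s1 - 1][0]
    # Keep only observations strictly between the two x1 cuts,
    # then order the survivors by their second coordinate.
    inner = sorted((p for p in sample if l1 < p[0] < m1), key=lambda p: p[1])
    l2, m2 = inner[r2 - 1][1], inner[s2 - 1][1]
    return l1, m1, l2, m2


def hit_rate(n, r1, s1, r2, s2, rho=0.8, trials=20000, seed=2):
    """Fraction of trials in which a new observation falls in the rectangle."""
    rng = random.Random(seed)

    def draw():
        # Assumed model: standard bivariate normal with correlation rho.
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        return (z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2)

    hits = 0
    for _ in range(trials):
        sample = [draw() for _ in range(n)]
        l1, m1, l2, m2 = wald_rectangle(sample, r1, s1, r2, s2)
        x1, x2 = draw()
        if l1 < x1 < m1 and l2 < x2 < m2:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Parameters of Figure A-2a; n = 9 is an assumption of this sketch.
    print(hit_rate(9, 2, 8, 1, 5), "expected:", (5 - 1) / (9 + 1))
```

Changing `rho` should leave the hit rate unchanged within sampling error, in contrast to an ordering that does not eliminate previously formed blocks.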


A.  5.  A  General  Ordering  Procedure 

Consider a more general method for ordering the observations.
Assume that a sample $(X_{1\alpha}, X_{2\alpha};\ \alpha = 1, \ldots, n)$ is available
from a continuous two-dimensional distribution $F(x_1, x_2)$. Let
$w_k = h_k(x_1, x_2)$, k = 1, ..., n be n functions, possibly alike, possibly
distinct, such that $h_1(X_1, X_2), \ldots, h_n(X_1, X_2)$ are random variables
with a continuous joint distribution function. Since $F(x_1, x_2)$ is
assumed to be continuous, the functions $w_k$, k = 1, ..., n are continuous
almost everywhere. These functions are called ordering functions. They are
used to form distribution-free tolerance regions in the following manner.
The first ordering function $h_1(x_1, x_2)$ is used to select the
observation $(X_{1(1)}, X_{2(1)})$ which satisfies

\[
\max_{\alpha = 1, \ldots, n} h_1(X_{1\alpha}, X_{2\alpha}) = h_1(X_{1(1)}, X_{2(1)}) .
\tag{A-28}
\]

Let the region of the sample space for which
$h_1(x_1, x_2) > h_1(X_{1(1)}, X_{2(1)})$ be called block $B_1$, and the region
of the sample space for which $h_1(x_1, x_2) = h_1(X_{1(1)}, X_{2(1)})$ be
called cut $T_1$. Then

\[
B_1 = \{ (x_1, x_2) : h_1(x_1, x_2) > h_1(X_{1(1)}, X_{2(1)}) \}
\]

and

\[
T_1 = \{ (x_1, x_2) : h_1(x_1, x_2) = h_1(X_{1(1)}, X_{2(1)}) \} .
\tag{A-29}
\]

Eliminate the block $B_1$ and the cut $T_1$ from the sample space. Note
that the observation $(X_{1(1)}, X_{2(1)})$ is also eliminated. Select from the
remaining n-1 observations the one which satisfies

\[
\max_{\substack{\alpha = 1, \ldots, n \\ \alpha \ne (1)}} h_2(X_{1\alpha}, X_{2\alpha})
= h_2(X_{1(2)}, X_{2(2)}) .
\tag{A-30}
\]

Let the region of the sample space for which
$h_2(x_1, x_2) > h_2(X_{1(2)}, X_{2(2)})$ and
$h_1(x_1, x_2) < h_1(X_{1(1)}, X_{2(1)})$ be called block $B_2$ and the region
for which $h_2(x_1, x_2) = h_2(X_{1(2)}, X_{2(2)})$ and
$h_1(x_1, x_2) < h_1(X_{1(1)}, X_{2(1)})$ be called cut $T_2$. Then

\[
B_2 = \{ (x_1, x_2) : h_2(x_1, x_2) > h_2(X_{1(2)}, X_{2(2)}),\;
         h_1(x_1, x_2) < h_1(X_{1(1)}, X_{2(1)}) \}
\]

and

\[
T_2 = \{ (x_1, x_2) : h_2(x_1, x_2) = h_2(X_{1(2)}, X_{2(2)}),\;
         h_1(x_1, x_2) < h_1(X_{1(1)}, X_{2(1)}) \} .
\tag{A-31}
\]

This procedure is continued until n blocks are eliminated from the
sample space. The sample space is thereby partitioned into n+1
mutually exclusive and exhaustive blocks by the n cuts.
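The sequential selection of observations and the resulting block membership test can be written compactly. The sketch below is one possible reading of the procedure, with hypothetical function names; it assumes one ordering function per observation and represents points as tuples.

```python
def order_observations(sample, hs):
    """Pick observations (1), (2), ... using ordering functions h_1, h_2, ...

    At step k the remaining observation with the largest h_k value is
    selected and removed, as in equations (A-28) and (A-30).
    """
    remaining = list(sample)
    ordered = []
    for h in hs:
        best = max(remaining, key=h)
        remaining.remove(best)
        ordered.append(best)
    return ordered


def block_index(x, ordered, hs):
    """1-based index k of the block B_k containing x.

    Returns n+1 for the final block and None if x lies exactly on a cut.
    """
    for k, (h, obs) in enumerate(zip(hs, ordered), start=1):
        if h(x) > h(obs):
            return k  # above the k-th cut, below all earlier cuts
        if h(x) == h(obs):
            return None  # on the cut T_k
    return len(ordered) + 1


if __name__ == "__main__":
    sample = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0)]
    first = lambda p: p[0]   # h(x1, x2) = x1
    second = lambda p: p[1]  # h(x1, x2) = x2
    hs = [first, second, first]
    ordered = order_observations(sample, hs)
    print(ordered)
    print(block_index((2.5, 6.0), ordered, hs))
```

With all ordering functions equal to $h(x_1, x_2) = x_1$ the blocks reduce to the one-dimensional intervals between order statistics, counted from the largest observation downward.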

It is now shown that this procedure produces distribution-free
tolerance regions. Denote the portion of the distribution contained in
block $B_i$ (i.e. the coverage of $B_i$) by $U_i^*$, i = 1, ..., n. To prove
that the coverages $U_i^*$, i = 1, ..., n are distribution-free it is
sufficient to prove that the probability density of the coverages is
uniform. The proof given here follows the one by Wilks (1962). Consider two
random variables $X_1$ and $X_2$ with distribution $F(x_1, x_2)$. The
probability element for sample $(X_{1\alpha}, X_{2\alpha};\ \alpha = 1, \ldots, n)$ is

\[
\prod_{\alpha=1}^{n} dF(x_{1\alpha}, x_{2\alpha}) .
\tag{A-32}
\]

Let $(X_{1(1)}, X_{2(1)})$ be the observation which yields the largest value
for $h_1(x_1, x_2)$. Let
$(X_{1\alpha}^{(1)}, X_{2\alpha}^{(1)};\ \alpha = 1, \ldots, n-1)$ be the set of
observations obtained by deleting $(X_{1(1)}, X_{2(1)})$ from the sample.
The probability element of $(X_{1(1)}, X_{2(1)})$ and
$(X_{1\alpha}^{(1)}, X_{2\alpha}^{(1)};\ \alpha = 1, \ldots, n-1)$ is

\[
n \, dF(x_{1(1)}, x_{2(1)}) \prod_{\alpha=1}^{n-1} dF(x_{1\alpha}^{(1)}, x_{2\alpha}^{(1)}) .
\tag{A-33}
\]

Let $U_1'$ be the coverage associated with the set of points $(x_1, x_2)$ for
which $h_1(x_1, x_2) > h_1(X_{1(1)}, X_{2(1)})$. The probability element of
$U_1'$ is

\[
n \, (1 - u_1')^{n-1} \, du_1'
\tag{A-34}
\]

where

\[
u_1' = \int_{B_1} dF(x_1, x_2)
\quad\text{and}\quad
du_1' = dF(x_{1(1)}, x_{2(1)}) .
\]

The probability element of the conditional random variable
$(X_{1\alpha}^{(1)}, X_{2\alpha}^{(1)} \mid X_{1(1)}, X_{2(1)};\ \alpha = 1, \ldots, n-1)$
is given by the ratio of equation A-33 to equation A-34. This ratio
reduces to

\[
\prod_{\alpha=1}^{n-1} dF^{(1)}(x_{1\alpha}^{(1)}, x_{2\alpha}^{(1)})
\tag{A-35}
\]

where

\[
F^{(1)}(x_{1\alpha}^{(1)}, x_{2\alpha}^{(1)}) =
\frac{F(x_{1\alpha}^{(1)}, x_{2\alpha}^{(1)})}{1 - u_1'} .
\]

Therefore given $(X_{1(1)}, X_{2(1)}) = (x_{1(1)}, x_{2(1)})$ the remaining
observations of the original sample behave like a sample of n-1
observations from the distribution

\[
F^{(1)}(x_1, x_2) = \frac{F(x_1, x_2)}{1 - u_1'} .
\tag{A-36}
\]


Now let $(X_{1(2)}, X_{2(2)})$ be the observation among the remaining
n-1 observations which yields the largest value for $h_2(x_1, x_2)$. Let
$(X_{1\alpha}^{(2)}, X_{2\alpha}^{(2)};\ \alpha = 1, \ldots, n-2)$ be the set of n-2
observations obtained by deleting $(X_{1(2)}, X_{2(2)})$ from the n-1
observations. The probability element of $(x_{1(2)}, x_{2(2)})$ and
$(x_{1\alpha}^{(2)}, x_{2\alpha}^{(2)};\ \alpha = 1, \ldots, n-2)$ is

\[
(n-1) \, dF^{(1)}(x_{1(2)}, x_{2(2)})
\prod_{\alpha=1}^{n-2} dF^{(1)}(x_{1\alpha}^{(2)}, x_{2\alpha}^{(2)}) .
\tag{A-37}
\]

Let $B_2$ be the region for which $h_2(x_1, x_2) > h_2(X_{1(2)}, X_{2(2)})$ and
$h_1(x_1, x_2) < h_1(X_{1(1)}, X_{2(1)})$. Let

\[
u_2' = \int_{B_2} dF^{(1)}(x_1, x_2)
\tag{A-38}
\]

be the conditional coverage associated with $B_2$. Then the probability
element of $U_2'$ is given by

\[
(n-1) \, (1 - u_2')^{n-2} \, du_2' .
\tag{A-39}
\]

The probability element of
$(X_{1\alpha}^{(2)}, X_{2\alpha}^{(2)};\ \alpha = 1, \ldots, n-2)$ is

\[
\prod_{\alpha=1}^{n-2} dF^{(2)}(x_{1\alpha}^{(2)}, x_{2\alpha}^{(2)})
\tag{A-40}
\]

where

\[
F^{(2)}(x_{1\alpha}^{(2)}, x_{2\alpha}^{(2)}) =
\frac{F^{(1)}(x_{1\alpha}^{(2)}, x_{2\alpha}^{(2)})}{1 - u_2'} .
\]

Therefore given $(x_{1(2)}, x_{2(2)})$, the remaining n-2 observations
behave like a sample from

\[
F^{(2)}(x_1, x_2) = \frac{F^{(1)}(x_1, x_2)}{1 - u_2'} .
\tag{A-41}
\]


Continuing in the above manner, one concludes that the conditional
coverages $U_1', U_2', \ldots, U_n'$ have the probability element

\[
n! \, (1 - u_1')^{n-1} \, (1 - u_2')^{n-2} \cdots (1 - u_{n-1}')^{1} \,
du_1' \cdots du_n' .
\tag{A-42}
\]

The conditional coverages ($U_i'$; i = 1, ..., n) are related to the
coverages ($U_i^*$; i = 1, ..., n) by the following expressions:

\[
U_1' = U_1^*, \qquad
U_2' = \frac{U_2^*}{1 - U_1^*}, \qquad \ldots, \qquad
U_n' = \frac{U_n^*}{1 - U_1^* - \cdots - U_{n-1}^*} .
\tag{A-43}
\]

Rewriting equation A-42 in terms of the $U_i^*$ one finds that the
probability element of the coverages is

\[
n! \, du_1^* \cdots du_n^* .
\tag{A-44}
\]

Since this equation is the same as equation A-14, the proof is completed.

The  procedure  for  forming  distribution-free  tolerance  regions 

has  been  further  generalized  to  permit  the  subsequent  ordering  functions 

to  depend  in  any  way  on  the  information  gained  in  the  application  of 

previous  ordering  functions.  See  Fraser  (1953),  Kemperman  (1956). 

The following procedure is one example of a particularly general
ordering. See Fraser (1957). Let $h_1(x), \ldots, h_n(x)$ be n real-valued
measurable functions of $x = (x_1, \ldots, x_D)$. Let
$\max^{(r_j)}_{i=1,\ldots,n} h_j(X_i)$ denote the $r_j$th largest value of
$h_j$. Note that $r_j$ must be a positive integer no greater than n. Let the
observations be ordered as follows. First, locate the observation which
gives the $r_1$th largest value for $h_1$. Denote this observation by
$X_{(1)}$. Then

\[
h_1(X_{(1)}) = \max^{(r_1)}_{i=1,\ldots,n} h_1(X_i), \qquad 1 \le r_1 \le n .
\tag{A-45}
\]

The sample space is partitioned into two subspaces,

\[
S_{1 \ldots r_1} = \{ x : h_1(x) > h_1(X_{(1)}) \}
\tag{A-46a}
\]

and

\[
S_{r_1+1 \ldots n+1} = \{ x : h_1(x) < h_1(X_{(1)}) \},
\tag{A-46b}
\]

by means of the cut

\[
T_1 = \{ x : h_1(x) = h_1(X_{(1)}) \} .
\tag{A-46c}
\]

The observation $X_{(1)}$ is eliminated from any further ordering. The
two subspaces $S_{1 \ldots r_1}$ and $S_{r_1+1 \ldots n+1}$ are treated
separately in further ordering. There are $r_1 - 1$ observations in
subspace $S_{1 \ldots r_1}$ and $n - r_1$ observations in subspace
$S_{r_1+1 \ldots n+1}$. Suppose next we locate the observation which gives
the $r_2$th largest value of $h_2(x)$ in $S_{1 \ldots r_1}$ where
$1 \le r_2 \le r_1 - 1$. Denote this observation by $X_{(2)}$. Then

\[
h_2(X_{(2)}) = \max^{(r_2)}_{X_i \in S_{1 \ldots r_1}} h_2(X_i) .
\tag{A-47}
\]

(Note that $h_2(x)$ can depend on $X_{(1)}$.) The subspace
$S_{1 \ldots r_1}$ is partitioned into subspaces

\[
S_{1 \ldots r_2} = \{ x : h_2(x) > h_2(X_{(2)}),\; h_1(x) > h_1(X_{(1)}) \}
\tag{A-48a}
\]

and

\[
S_{r_2+1 \ldots r_1} = \{ x : h_2(x) < h_2(X_{(2)}),\; h_1(x) > h_1(X_{(1)}) \}
\tag{A-48b}
\]

by means of cut

\[
T_2 = \{ x : h_2(x) = h_2(X_{(2)}),\; h_1(x) > h_1(X_{(1)}) \} .
\tag{A-48c}
\]

The observation $X_{(2)}$ is eliminated from further ordering. Next we
order in one of the three subspaces $S_{1 \ldots r_2}$,
$S_{r_2+1 \ldots r_1}$, or $S_{r_1+1 \ldots n+1}$. This procedure is
continued until n+1 blocks are formed.

The two-dimensional example shown in Figure A-3 illustrates
this ordering procedure. The coordinates of the space are labeled $x_1$
and $x_2$ and arrows show the direction of increasing magnitude. The
three observations are labeled with X's. The parameters for the first
ordering are $h_1(x) = x_1$ and $r_1 = 2$. This divides the space into
$S_{12}$ and $S_{34}$. The parameters $h_2(x) = |x - X_{(1)}|$ and $r_2 = 1$
are used to divide $S_{12}$ into $S_1$ and $S_2$. The parameters
$h_3(x) = x_2$ and $r_3 = 1$ are used to divide $S_{34}$ into $S_3$ and
$S_4$. A total of four blocks $S_1$, $S_2$, $S_3$, and $S_4$ result.

A.  6.  Discontinuities 

Distribution-free  tolerance  regions  can  be  constructed  when  the 
original  distribution  has  a  countable  number  of  discontinuities.  In 
dealing  with  discontinuous  distributions,  problems  arise  from  the  finite 
probabilities  associated  with  the  cuts.  In  addition,  the  construction  pro¬ 
cedure  must  incorporate  a  method  for  handling  ties. 

It  has  been  shown  in  Scheffe  and  Tukey  (1945),  Tukey  (1948),  and 
Fraser  and  Wormleighton  (1951)  that  if  all  cuts  adjacent  to  the  blocks  of 
interest  are  included  in  R  ,  the  region  of  interest,  the  following  statement 
can  be  made 


Pr 


(A-49) 


where  $\beta$  and  $\gamma$  are  determined  from  the  number  of  blocks  contained
in  R.  For  example  if  in  Figure  A-3  is  the  region  of  interest,  the 

*  xi 

be  included  in  R  so  that  statement  (A-49)  can  be  made. 


cut  T 


f  t 


-id)} 


and  the  cut 


V  fe:X2  =  *2<3)’  X 


1  -  -1(1)1 


must 
A. 7. Other Extensions

A tolerance region theory which uses statistically equivalent
blocks has been developed for cases in which the class of probability
distribution functions is limited. As might be expected, when the kind of
distribution function under consideration is restricted, more blocks can
be eliminated from the region of interest for the same $\gamma$ and $\beta$ than
can be eliminated when nothing is known about the distribution. A
frequently considered class of distributions is the class for which
$f(x) / (1 - F(x))$, the hazard rate, is a monotone function. See Hanson and
Koopmans (1964) and Barlow and Proschan (1966) for further information.
Appendix B

One can find values of $\beta$ in tables of 50 percentage points of the Beta
function or from tables of the cumulative Binomial distribution by using
the relation

\[
\gamma = \sum_{s=m}^{n} \binom{n}{s} (1-\beta)^{s} \, \beta^{\,n-s} .
\]

For values of n-m greater than m-1, the mean is less than
the median which is less than the mode. For example, consider the
following table.

Table B-1. Comparison of the Mean, Median and Mode
for Two Different Values of m and n.

Conditions      Mean       Median     Mode
n=65, m=6       0.90909    0.91321    0.92188
n=60, m=1       0.98361    0.98851    1.00000
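The entries of Table B-1 can be reproduced from the coverage density, which is a Beta density with parameters n+1-m and m. The sketch below is illustrative and uses hypothetical function names; the mean and mode come from the standard Beta formulas, and the median is found by bisection on the binomial relation with $\gamma = 0.5$.

```python
from math import comb


def gamma_of_beta(n, m, beta):
    """gamma = sum_{s=m}^{n} C(n, s) (1 - beta)^s beta^(n - s)."""
    p = 1.0 - beta
    return sum(comb(n, s) * p ** s * beta ** (n - s) for s in range(m, n + 1))


def coverage_median(n, m, tol=1e-12):
    """beta for which gamma_of_beta(n, m, beta) = 0.5, by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_of_beta(n, m, mid) > 0.5:  # gamma decreases as beta grows
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def coverage_mean(n, m):
    """Mean of the Beta(n+1-m, m) coverage density."""
    return (n + 1 - m) / (n + 1)


def coverage_mode(n, m):
    """Mode of the Beta(n+1-m, m) coverage density (requires n > 1)."""
    return (n - m) / (n - 1)


if __name__ == "__main__":
    for n, m in [(65, 6), (60, 1)]:
        print(n, m, coverage_mean(n, m), coverage_median(n, m),
              coverage_mode(n, m))
```

For m = 1 the median has the closed form $0.5^{1/n}$, which provides an independent check on the bisection.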

Appendix  C 


Let $R_\alpha$ be a distribution-free tolerance region on the continuous
cumulative probability distribution function $F_1(x)$. The purpose of this
appendix is to demonstrate that the expected value of
$\int_{R_\alpha} dF_1(x)$,

\[
E\left\{ \int_{R_\alpha} dF_1(x) \right\} = \alpha ,
\tag{C-1}
\]

can be considered an $\alpha$-confidence statement that a new observation
from $F_1(x)$ will fall in $R_\alpha$. In other words, equation C-1 is
equivalent to saying that the probability is $\alpha$ that a new observation
from $F_1(x)$ will fall in $R_\alpha$.

Let  X.  .  and  X,  .  be  the  r^  and  s^  order  statistics,  respec- 
(r)  (s) 

tively,  from  the  distribution  F^(x)  ,  r  <  s  <  n^  .  Let  Y  be  a  new 
observation  from  this  distribution.  The  probability  that  Y  falls  in 

<X(r,'  X(s)]  is 


Y  <  X(S)>  =  CS,dX(r)  S"(S>dyfl(X>  ^(r)*  X(s,’  <C-2> 


(r) 


The  integral  with  respect  to  y  is 


l  (S)f!(y)dy  =F1(x(s))-F1(x(r)). 


(r) 


(C-3) 


The  probability  element,  f(x,  x.  v)dx,  „  dx,  v  ,  from  equation  A-7, 

(r)  (s)  (r)  (s) 

Appendix  A  is 
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nr+DdF^jdF^) 

r(r)r(n+l-s)r(s-r) 


r-1 


Vl(x(r)>]  * 


[Fl(x(s)>-Fl<x(r))]8  r 


n  -  s 


(C-4) 


Making  *he  transformation  W  =  F,(X.  .  )-F,(X.  .)  and  Z  =  F_(X,  .)  and 

1  (s)  1  (r)  1  (r) 

proceeding  as  in  equations  A- 8  and  A- 9  of  Appendix  A,  one  obtains 


Pr(X,_,<Y<  X,_J=  \  w* 


r(nj+l) 


(r)  -  (s)  Jn  r(s-r)  r(n  -s  +  r+1) 


n  -s+r 

wS  r  (1-w)  dw.  (C-  5) 


0 


1 


Note that

$$\frac{\Gamma(n_1+1)}{\Gamma(s-r)\,\Gamma(n_1-s+r+1)}\, w^{s-r-1} (1-w)^{n_1-s+r}\, dw$$

is the probability element for the coverage of $(X_{(r)}, X_{(s)}]$ (equation A-9). Then the right side of equation C-5 is the expected coverage of $(X_{(r)}, X_{(s)}]$. In fact, by recognizing that

$$\int_0^1 w^{s-r} (1-w)^{n_1-s+r}\, dw = \frac{\Gamma(s-r+1)\,\Gamma(n_1-s+r+1)}{\Gamma(n_1+2)},$$

we obtain

$$\Pr\{X_{(r)} < Y \le X_{(s)}\} = \frac{s-r}{n_1+1},$$

the expected coverage of $(X_{(r)}, X_{(s)}]$. Since the blocks are statistically equivalent, the result holds for any $s-r$ blocks. Hence, in general, the statement

$$E\left\{\int_{R_1} dF_1(x)\right\} = \alpha,$$

where $R_1$ is a distribution-free tolerance region, can be considered an $\alpha$-confidence statement that a new observation from $F_1(x)$ will fall in $R_1$.
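The expected-coverage result $(s-r)/(n_1+1)$ can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original analysis; the function names, the trial count, and the choice of test distributions are assumptions made for illustration. It draws a training sample, forms the interval between the $r$th and $s$th order statistics, and counts how often a fresh observation lands inside it.

```python
import random

def expected_coverage(n, r, s):
    """Closed form derived above: (s - r) / (n + 1)."""
    return (s - r) / (n + 1)

def coverage_frequency(n, r, s, draw, trials=20000, seed=0):
    """Fraction of trials in which a new draw Y falls in (X_(r), X_(s)].

    `draw` is a hypothetical callback that takes a random.Random
    instance and returns one observation from a continuous distribution.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = sorted(draw(rng) for _ in range(n))
        y = draw(rng)
        # Order statistics are numbered from 1 in the text.
        if sample[r - 1] < y <= sample[s - 1]:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    uniform = lambda rng: rng.random()
    exponential = lambda rng: rng.expovariate(1.0)
    print(expected_coverage(9, 2, 7))
    print(coverage_frequency(9, 2, 7, uniform))
    print(coverage_frequency(9, 2, 7, exponential))
```

Because the result is distribution-free, the observed frequency should be close to the closed form for both test distributions, up to sampling error.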
REVIEW OF CLASSIFICATION METHODS IN PATTERN
RECOGNITION  GIVEN  TRAINING  SAMPLES 
OF  KNOWN  CLASSIFICATION 

The  following  is  a  review  of  classification  methods  along  the 
guidelines  discussed  in  Chapter  1. 

I.  Optimum  Solution  with  Assumed  Probability  Densities 

Using this approach the machine designer assumes the form of
the a priori and conditional probabilities. Hence "optimum" recognition
is achieved if the assumptions are correct. A model for the optimum
recognition system is shown in Figure D-1. In this figure $v$ is a vector
in measurement space, $\xi_i$ is the a priori probability of a class $i$ event,
$f_i(v)$ is the probability density function of $v$ given that $v$ is a member
of class $i$, and $C_i(j)$ is the cost of deciding that $v$ is a member of
class $j$ when it is a member of class $i$.

The optimum recognition system is considerably simplified if
certain assumptions are made about the $C_i(j)$. Suppose the cost of
making an incorrect decision is equal to one and the cost of making a
correct decision is equal to zero. Then the optimum recognition system
consists of the probability density computers $\xi_i f_i(v)$, $i = 1, \ldots, K$, and a
maximum selector. The inputs to the extremum selector (whether maximum
or minimum) are called discriminant functions.

Figure D-1. Optimum Recognition System.


Let the discriminant functions be denoted by $g_i(v)$, $i = 1, \ldots, K$. Then for the above cost assumptions an optimum recognition system classifies $v$ into class $k$ if $g_k(v) > g_i(v)$ for all $i$.

Consider the problem of finding the discriminant functions, which
are optimum in the Bayes sense, when there are $K$ Gaussian classes
with covariance matrices $\Sigma_i$ and mean vectors $\mu_i$, $i = 1, \ldots, K$. Assume
also that the cost of an incorrect decision is one and that the cost of a
correct decision is zero. The discriminant functions are

$$g_i(v) = \frac{\xi_i}{(2\pi)^{D/2}\, |\Sigma_i|^{1/2}} \exp\left\{-\tfrac{1}{2}\, (v - \mu_i)^T \Sigma_i^{-1} (v - \mu_i)\right\}, \quad i = 1, \ldots, K,$$  (D-1)

where $D$ is the number of components of $v$ and $(\cdot)^T$ denotes the transpose of $(\cdot)$. Let $h_i(v) = \log g_i(v)$.

Since the log function is a monotonically increasing function of its argument,
the log of $g_i(v)$ can be used instead of $g_i(v)$ without any change
in the decision. In the following, $\log g_i(v)$ and $g_i(v)$ will both be referred
to as discriminant functions. The log of $g_i(v)$ for the Gaussian case is

$$h_i(v) = \log \xi_i - \frac{D}{2} \log(2\pi) - \frac{1}{2} \log|\Sigma_i| - \frac{1}{2}\, (v - \mu_i)^T \Sigma_i^{-1} (v - \mu_i), \quad i = 1, \ldots, K.$$  (D-2)

Therefore,  the  optimum  discriminant  functions  for  Gaussian  patterns 
are  quadratic  functions. 

Consider the case where the covariance matrices are equal.
Keeping only the terms in equation D-2 which depend on $i$, the discriminant
functions become

$$h_i(v) = \log \xi_i + v^T \Sigma^{-1} \mu_i - \tfrac{1}{2}\, \mu_i^T \Sigma^{-1} \mu_i, \quad i = 1, \ldots, K.$$  (D-3)


These functions are linear in $v$, so the boundaries between classes are hyperplanes in $v$ space. Note that if $K = 2$,
only one discriminant function, $h(v) = h_1(v) - h_2(v)$, is needed. If
$h(v) > 0$, $v$ is classified into class 1. If $h(v) < 0$, $v$ is classified into
class 2. For $K = 2$, $h(v)$ is given by

$$h(v) = h_1(v) - h_2(v) = \log\frac{\xi_1}{\xi_2} + v^T \Sigma^{-1} (\mu_1 - \mu_2) - \tfrac{1}{2}\, \mu_1^T \Sigma^{-1} \mu_1 + \tfrac{1}{2}\, \mu_2^T \Sigma^{-1} \mu_2.$$  (D-4)


The decision boundary, which is obtained by setting $h(v)$ equal to zero,
is a hyperplane which crosses the line connecting the means, bisecting it when the a priori probabilities are equal. The hyperplane
is inclined to this line at an angle determined by the covariance
matrix and the relative positions of the means. If the a priori probabilities
of each class are equal and the covariance matrices are proportional to
the identity matrix (i.e. $\Sigma_i = \gamma I$, where $i = 1, 2$ and $\gamma$ is a constant),
equation D-4 becomes

$$h(v) = \frac{1}{\gamma} \left[ v^T (\mu_1 - \mu_2) - \tfrac{1}{2}\, \mu_1^T \mu_1 + \tfrac{1}{2}\, \mu_2^T \mu_2 \right].$$  (D-5)


This  is  the  equation  for  a  hyperplane  which  is  the  perpendicular  bisector 
of  the  line  connecting  the  means  of  the  two  distributions. 
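The linear discriminant of equation D-3 is short enough to sketch directly. The code below is an illustrative sketch restricted to two-dimensional measurements so that the matrix inverse can be written out by hand; the function names and the test values are assumptions, not part of the source.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, x):
    return [m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1]]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def h_linear(v, prior, mu, sigma_inv):
    """Equation D-3: log(prior) + v^T S^-1 mu - 1/2 mu^T S^-1 mu."""
    s_mu = matvec(sigma_inv, mu)
    return math.log(prior) + dot(v, s_mu) - 0.5 * dot(mu, s_mu)

def classify(v, priors, means, sigma):
    """Return the class (numbered from 1) with the largest h_i(v)."""
    sigma_inv = inv2(sigma)
    scores = [h_linear(v, p, m, sigma_inv) for p, m in zip(priors, means)]
    return scores.index(max(scores)) + 1

if __name__ == "__main__":
    priors = [0.5, 0.5]
    means = [(0.0, 0.0), (2.0, 0.0)]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(classify((0.9, 5.0), priors, means, identity))
    print(classify((1.1, -3.0), priors, means, identity))
```

With equal a priori probabilities and an identity covariance matrix, the boundary in this example is the line $v_1 = 1$, the perpendicular bisector of the segment joining the two means.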

If slight changes are made in the assumptions, different decision
boundaries result. For example, if the class densities are spherically
symmetric Gaussian densities which differ only in location and scale,
the hypersphere is the optimum boundary. Cooper (1962), (1963) further
investigates the hyperplane and hypersphere as decision boundaries.

If the mean and/or the variance is unknown, a "good" estimate
for these statistics can be used in place of the actual mean and variance.
Such an estimate for the mean is

$$\hat{\mu}(j) = \frac{1}{N_j} \sum_{i=1}^{N_j} v_i(j),$$  (D-6)

where $\hat{\mu}(j)$ is the estimate of the mean vector of the $j$th class, $N_j$ is
the number of observations of the $j$th class, and $v_i(j)$ is the $i$th observation
vector of the $j$th class. A "good" estimate for the covariance
matrix is

$$\hat{\Sigma}(j) = \frac{1}{N_j} \sum_{i=1}^{N_j} [v_i(j) - \hat{\mu}(j)]\, [v_i(j) - \hat{\mu}(j)]^T.$$  (D-7)

II. Estimation or Approximation of the Probability Densities

Many of the authors in the statistical literature consider the class
of probability density estimators

$$\hat{f}_n(v) = \frac{1}{n} \sum_{j=1}^{n} w(v, V_j).$$  (D-8)

$\hat{f}_n(v)$ is the estimate of the probability density $f(v)$ when $n$ observations $V_1, \ldots, V_n$ are available from $f(v)$. The function $w(v, V_j)$ might take a form such as

[Equation D-9, a weighting function of $v - V_j$ and $h$, is not recoverable from the scan.]  (D-9)


where  h  is  some  function  of  n  .  Other  weighting  functions  are  given 
in  Table  1,  p.  1068  of  Parzen  (1962). 

Fix and Hodges (1951) use the following equation to estimate the probability density,

$$\hat{f}_n(v) = \frac{F_n(v + h) - F_n(v - h)}{2h},$$  (D-10)

where

$$F_n(v) = \frac{\text{the number of observations} \le v}{n}.$$


The parameter $h$ is a function of $n$ which approaches zero as $n$ approaches infinity. The form of $h$ which gives the "best" results has not been determined. Rosenblatt (1956) lets $h = \beta n^{-\alpha}$ and obtains the $\beta$ and $\alpha$ which minimize the expected mean square error.

Loftsgaarden and Quesenberry (1965) propose an estimator similar
to (D-10) for estimating the density function. Their estimator is

$$\hat{f}_n(v) = \frac{k(n) - 1}{2hn},$$  (D-11)


where $k(n)$ is some integer which is less than $n$. That is, they specify
some number $k(n)$. They then calculate the distance between $v$ and
the $k(n)$th closest observation to $v$. This distance is substituted for $h$.
Thus, rather than specify some distance $h$ and find the number of observations
$k$, as Fix and Hodges do, they specify some number of observations
$k$ and find the distance $h$. Consistency is shown for this estimator
when $k(n) \to \infty$ and $k(n)/n \to 0$ as $n \to \infty$. They indicate that $k(n) = n^{1/2}$ gave good results in some empirical work.

These concepts are readily applicable for estimating the density
function for a $D$-dimensional vector $v$. For example, if Euclidean
distance is used, equation D-10 becomes

$$\hat{f}_n(v) = \frac{\text{the number of observations contained in a hypersphere of radius } r \text{ from } v}{n\, [\, 2 r^D \pi^{D/2} / (D\, \Gamma(D/2))\, ]}.$$  (D-12)

Loftsgaarden and Quesenberry's estimator becomes

$$\hat{f}_n(v) = \frac{k(n) - 1}{n\, [\, 2 r^D \pi^{D/2} / (D\, \Gamma(D/2))\, ]}.$$  (D-13)
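The one-dimensional estimators of equations D-10 and D-11 can be sketched as follows. This is an illustration only; the function names and the evenly spaced test data are assumptions. The first function fixes the half-width $h$ and counts observations, while the second fixes the count $k$ and measures the distance to the $k$th closest observation.

```python
def fix_hodges_density(v, data, h):
    """Equation D-10: [F_n(v + h) - F_n(v - h)] / (2h), F_n the empirical CDF."""
    n = len(data)
    upper = sum(1 for x in data if x <= v + h)
    lower = sum(1 for x in data if x <= v - h)
    return (upper - lower) / (2.0 * h * n)

def knn_density(v, data, k):
    """Equation D-11: (k - 1) / (2 h n), h = distance to the k-th closest point.

    Raises ZeroDivisionError if the k-th closest observation coincides with v.
    """
    n = len(data)
    h = sorted(abs(x - v) for x in data)[k - 1]
    return (k - 1) / (2.0 * h * n)

if __name__ == "__main__":
    # 100 evenly spaced points on (0, 1); the true density is 1 there.
    data = [(i + 0.5) / 100 for i in range(100)]
    print(fix_hodges_density(0.5, data, 0.1))
    print(knn_density(0.5, data, 10))
```

On evenly spaced data both estimates should be close to one at the centre of the interval, which makes the two formulas easy to compare.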


One sees that these procedures are very easily used in classification.
Rather than compute the class probability densities at a point, one
need only count the number of observations of each class within an appropriate
radius from the new observation $V$. Under certain costs, a priori
probabilities, and number of observations from each class, the new
observation is assigned to the class which is most heavily represented by
these observations. This procedure, called the $k_n$ nearest-neighbor
rule, is further discussed in the last section of this appendix.

Sebestyen (1962a), (1962b) proposes a histogram approach which
involves the approximation of the probability density with many Gaussian
subdensities. Consider this method, which is called "adaptive sample
set construction" by Sebestyen. Suppose we wish to distinguish events of
class F from those of class G. Let us approximate each class probability
density with many spherically symmetric multivariate Gaussian densities.
Suppose we use $K_F$ subclasses to represent class F and $K_G$ subclasses
to represent class G. The decision is made in favor of class F if

$$\sum_{i=1}^{K_F} p(F_i)\, p_{F_i}(v) > C \sum_{j=1}^{K_G} p(G_j)\, p_{G_j}(v),$$  (D-14)


where $C$ is a constant, $p(F_i)$ is the a priori probability of subclass $F_i$, and $p_{F_i}(v)$ is the conditional probability density of $v$ given subclass $F_i$. Since the a priori probability of a certain subclass is not known, it is estimated. Letting $M_{F_i}$ be the number of observations in subclass $F_i$ and $M_{G_j}$ be the number of observations in subclass $G_j$, the decision rule for deciding in favor of class F is

$$\sum_{i=1}^{K_F} M_{F_i} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{n=1}^{D} [v_n - m_n(F_i)]^2 \right\} > C' \sum_{j=1}^{K_G} M_{G_j} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{n=1}^{D} [v_n - m_n(G_j)]^2 \right\}.$$  (D-15)


Here, $m_n(F_i)$ is the $n$th coordinate of the mean of subclass $F_i$, $D$ is the number of coordinates, and $\sigma^2$ is the variance. The selection of the means of the subclasses can be made as follows. Let us introduce the first training observation $V^1$. Suppose that $V^1$ is a member of class F. We assign $V^1$ to subclass $F_1$ with mean $m_{F_1} = V^1$ and variance $\sigma^2$. The value of the variance $\sigma^2$ is arbitrarily chosen. At this point in the procedure, $M_{F_1}$ in the above equation is equal to one. Now we introduce the second observation $V^2$. If $V^2$ is a member of class F and within a radius $T$ (its value is arbitrary) of $m_{F_1}$, we set $M_{F_1} = 2$ and let $m_{F_1}$ equal the mean of the first two observations. If $V^2$ is a member of class F but lies outside the sphere of radius $T$ with center at $V^1$, we assign $V^2$ to subclass $F_2$ with mean $m_{F_2} = V^2$ and variance $\sigma^2$. If $V^2$ is a member of class G, we assign the sample to subclass $G_1$ with mean $m_{G_1} = V^2$ and variance $\sigma^2$. This process is continued until all training observations are exhausted. It is seen that this procedure approximates the class probability densities by many Gaussian subdensities. The degree of approximation depends on the original class probability densities, the variance $\sigma^2$, the radius $T$, and the order in which the samples are introduced. Waltz and Fu (1965) have used this general idea along with the gradual reduction of each subset radius to facilitate a more precise boundary.
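The construction just described, together with the decision rule of equation D-15, can be sketched as follows. This is an illustrative sketch rather than Sebestyen's implementation. The names are hypothetical, and one detail is an assumption: when an observation lies within radius $T$ of more than one existing subclass mean, it is absorbed by the first such subclass found.

```python
import math

def build_subclasses(observations, radius):
    """observations: iterable of (label, vector).

    Returns {label: [[count, mean], ...]}, one entry per subclass.
    """
    subclasses = {}
    for label, v in observations:
        cells = subclasses.setdefault(label, [])
        for cell in cells:
            count, mean = cell
            if math.dist(v, mean) <= radius:
                # Absorb the observation and update the running mean.
                cell[0] = count + 1
                cell[1] = [m + (x - m) / (count + 1) for m, x in zip(mean, v)]
                break
        else:
            # No subclass is close enough: start a new one at v.
            cells.append([1, list(v)])
    return subclasses

def class_score(v, cells, variance):
    """One side of D-15: sum of M * exp(-|v - m|^2 / (2 sigma^2))."""
    return sum(count * math.exp(-math.dist(v, mean) ** 2 / (2.0 * variance))
               for count, mean in cells)

def decide(v, subclasses, variance, c=1.0):
    """Return "F" if the F score exceeds c times the G score, else "G"."""
    f_score = class_score(v, subclasses["F"], variance)
    g_score = class_score(v, subclasses["G"], variance)
    return "F" if f_score > c * g_score else "G"

if __name__ == "__main__":
    training = [("F", (0.0, 0.0)), ("F", (1.0, 0.0)),
                ("F", (10.0, 0.0)), ("G", (5.0, 5.0))]
    sub = build_subclasses(training, radius=2.0)
    print(sub)
    print(decide((0.4, 0.0), sub, variance=1.0))
```

Reordering the training list can change which subclasses are created, which illustrates the dependence on presentation order noted in the text.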

Sebestyen and Edie (1966) have devised a scheme for estimating
a multi-dimensional density using hyperellipsoids as estimation cells.
Their method allows the size and the shape of the histogram cells to be
influenced by the local distribution of the data. The initial size and shape
of the first histogram cell is chosen arbitrarily. The updating and
generating of new cells is similar to Sebestyen's sample set construction.
A difference arises in the handling of observations which fall outside
existing cells but nevertheless "close" to the boundary. These events
are stored for processing at a later time when the average number of
elements per cell reaches a certain threshold. The cell size may increase
or decrease as more training observations are received for classification.
One problem with this procedure is that there are many variables which
must be determined by trial and error. When the initial training is finished,
one may decide that too few or too many cells have been generated. Then
certain parameters must be changed and the training procedure redone.

Aizerman et al. (1964b) use the method of potential functions
(orthogonal functions) to approximate an unknown probability density.
They assume that the probability density $f(v)$ exists and that a finite number $N$ of
orthonormal functions $\phi_i(v)$ can be selected so that $f(v) = \sum_{i=1}^{N} c_i \phi_i(v)$.
A training algorithm is proposed so that, as the number of independent,
identically distributed training observations approaches infinity, the
resulting function will converge in probability to $f(v)$.

III.  Estimation  or  Approximation  of  the  Class  Discriminating 
Boundaries 

The  object  here  is  to  assume  a  form  for  the  decision  boundary 
and  to  locate  the  boundary  so  that  the  best  possible  recognition  is  obtained. 

A simple boundary like a hyperplane or hypersphere is usually employed.

Let us first consider the use of the hyperplane. This has been
thoroughly treated in the literature: Highleyman (1962), Cooper (1962),
(1963), Albert (1963), Peterson and Mattson (1966), Wolff (1966). We have
seen that the hyperplane is the optimum boundary for discriminating
between 2 classes which are described by normal distributions with equal
covariance matrices and different means. It can also be seen that the
hyperplane is the optimum boundary for two classes which are equally
probable a priori, have equal costs of misrecognition, and have probability
densities which are ellipsoidally symmetric with equal eccentricities
and monotonically decreasing from the mean, c.f. Cooper (1962).

It turns out that the hyperplane is the optimum discriminant for
other cases. For example, suppose we wish to discriminate between two
classes where the sample vectors $v$ consist of $D$ binary components,
either zero or one. If the components of $v$ are statistically independent,
a linear discriminant function is optimum, c.f. Minsky (1961), Nilsson
(1965).

One needs, in general, $K(K-1)/2$ hyperplanes to separate $K$
classes. Some of these hyperplanes may not be needed depending on the
location and shape of the pattern classes. For example, it may be
possible to separate each class from all the remaining classes. In this
case only $K-1$ hyperplanes are required.

Suppose that we wish to distinguish between 3 classes by using
hyperplanes. A general two-dimensional situation is shown in Figure
D-2. The classes are represented by the circles labeled (1), (2), and (3).
Boundary $B_{ij}$ separates class $i$ from class $j$. Two problems become
evident. (1) For best results, the 3 hyperplanes (lines in the figure)
cannot be positioned independently of one another. (2) A region may appear
in which an observation may be classified as belonging to any of the three
classes. This region is known by many names such as void region,
region of indecision, deferred decision region, or reject region. This
is the crosshatched region of Figure D-2.

Consider problem (1). Suppose the objective is to minimize the
number of training observations which are misclassified. For best performance,
the boundaries should be determined simultaneously. However,
the simultaneous location of $K(K-1)/2$ hyperplanes for minimum misclassification
is often very difficult. Hence the hyperplanes are usually positioned
sequentially. After the hyperplanes are located, the results may
not be as good as expected. In this case Highleyman (1962) suggests an
iterative procedure in which all subsequent hyperplanes are located by
using only the observations which are correctly classified by the previously
located hyperplanes. That is, if $B_{12}$ in Figure D-2 is located first,
then $B_{23}$ is located using only the observations of class 2 which are
correctly classified by $B_{12}$.

X  c* 

Now consider the void region in which a new observation may be
classified into class 1, 2, or 3. For example, an observation in the
crosshatched region of Figure D-2 lies to the class 1 side of $B_{12}$ and
to the class 3 side of $B_{13}$. Therefore if this observation is compared to
$B_{12}$ and then to $B_{13}$, it is classified into class 3. However, if the
observation is compared to $B_{23}$ and then $B_{12}$, it is classified into
class 1. Note that a void region will not occur if the class probability
densities are Gaussian with equal variances and different means. This
is because the optimum decision hyperplanes in this case are the perpendicular
bisectors of the lines joining the means of the classes. Note also
that if the observations are transformed into a space where the likelihood
ratios act as coordinates, no void region results when the Bayes criterion
is used, c.f. Van Trees (1968).

It was shown in equations D-1, D-2, and D-3 that the hyperplane
is the optimum decision surface for separating two classes which have
Gaussian distributions with different means and equal covariance matrices.
Anderson and Bahadur (1962) have investigated linear procedures for
classifying observations from Gaussian distributions with unequal covariance
matrices. They give methods for constructing a hyperplane which
minimizes one probability of misclassification, given the other, and for
constructing the optimum hyperplane when a minimax criterion is used.

Now assume that $f_i(v)$ is unknown, as it is in most pattern
recognition problems. Highleyman (1962) proposes that the optimum
hyperplane be determined by a search through a set of hyperplanes for
one which minimizes the maximum likelihood estimate of the expected
risk. He suggests that the expected risk be estimated by

$$\frac{C_i(j)\, e_i(j) + C_j(i)\, e_j(i)}{N},$$

where $e_i(j)$ is the number of observations from class $i$ which are classified
into class $j$ and $N$ is the total number of observations which are
used in the estimate. If the a priori probabilities $\xi_i$ are known beforehand,
the number of class $i$ observations, $n_i$, which are used in
making the estimate should be determined by $n_i = \xi_i N$.

It is seen that the estimate of the expected risk is a discontinuous
function because it is determined by a finite number of discrete observations.
This prohibits the use of a gradient method in a search for the
minimum risk. Highleyman chooses to approximate the abrupt change
in risk, when the hyperplane is moved from one side of an observation
to the other, by a continuous function of distance from the observation to
the hyperplane. A convenient approximation for this step function is the
Gaussian cumulative distribution with mean at the hyperplane location
and variance $\sigma^2$. The risk function, which is the sum of these functions
over all observations, is then minimized with respect to the hyperplane
coordinates by the method of steepest descent. The variance $\sigma^2$ is
reduced and the process repeated until the desired recognition accuracy
is achieved.
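The smoothing idea can be illustrated in one dimension, where the "hyperplane" is a single threshold $t$ and class 1 is taken to lie below it. The sketch below is not Highleyman's procedure in detail; the unit costs, the numerical gradient, the step size, and the schedule of $\sigma$ values are all assumptions chosen for illustration.

```python
import math

def phi(z):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def smooth_risk(t, class1, class2, sigma):
    """Smoothed error fraction for the rule 'class 1 if x < t'.

    Each observation contributes a Gaussian CDF of its signed distance
    to the threshold in place of a 0/1 step.
    """
    risk = sum(phi((x - t) / sigma) for x in class1)
    risk += sum(phi((t - x) / sigma) for x in class2)
    return risk / (len(class1) + len(class2))

def descend(t, class1, class2, sigma, step=0.5, iters=200, eps=1e-4):
    """Steepest descent on t using a central-difference gradient."""
    for _ in range(iters):
        grad = (smooth_risk(t + eps, class1, class2, sigma)
                - smooth_risk(t - eps, class1, class2, sigma)) / (2.0 * eps)
        t -= step * grad
    return t

def fit_threshold(class1, class2, t0, sigmas=(2.0, 1.0, 0.5)):
    """Repeat the descent while reducing sigma, as described in the text."""
    t = t0
    for sigma in sigmas:
        t = descend(t, class1, class2, sigma)
    return t

if __name__ == "__main__":
    print(fit_threshold([0.0, 1.0, 2.0], [4.0, 5.0, 6.0], t0=0.5))
```

For the symmetric data in the usage example the threshold should settle between the two groups of observations.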

Wolff (1966) states that the method of steepest descent contains
the inherent disadvantage that a relative instead of an absolute minimum
may be obtained. He employs a variant of the "creeping random
method" by Brooks (1958) which Wolff states is more appropriate for discontinuous
functions and has less chance of yielding a relative minimum.

The equation of a hyperplane in $D$ dimensions is $w^T v = 0$, where
$w^T = [w_0\ w_1\ \cdots\ w_D]$ and $v$ is the observation vector augmented with a constant unit component. Wolff's method consists of first
selecting a starting position for the hyperplane. Let the initial weight
vector $w$ be $w^0$. The error rate is then determined for this hyperplane.
The hyperplane is given a displacement (rotation and/or translation) and
the error rate is calculated for the hyperplane in its new position. Let
the new position be described by $w^1 = w^0 + \Delta w^1$. If the error rate in the
new position is less than the error rate in the former position, the hyperplane
is given a displacement from the new position. In this case
$w^2 = w^1 + \Delta w^2$. Otherwise the hyperplane is given a displacement from
the former position ($w^2 = w^0 + \Delta w^2$). The increments of displacement
$\Delta w^1, \Delta w^2, \ldots$ are chosen from a random number generator. Wolff uses
a Gaussian distribution with zero mean and variance $\sigma^2$ to generate the
random numbers. The variance is held constant in the early stage of
the process and then reduced during the final stages.

The displacement of the hyperplane depends upon its starting
position $w^i$ as well as the increment $\Delta w^{i+1}$. For certain $w^i$, the
increment $\Delta w^{i+1}$ needs to be larger than it would for other $w^i$ for the
same relative displacement. Wolff eliminates this problem by describing
the hyperplane by a point on a unit sphere in $w$ space. Such a point is
given by

$$w_i = \begin{cases} \cos\theta_1, & i = 0 \\ \cos\theta_{i+1} \prod_{r=1}^{i} \sin\theta_r, & i \ne 0, D \\ \prod_{r=1}^{D} \sin\theta_r, & i = D. \end{cases}$$

He then uses the creeping random method to increment the $\theta_r$. Questions
remain about (1) the rate of convergence, (2) termination of the search,
and (3) the best statistics for the random number generator when the
creeping random method is used.
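The search can be sketched as follows. This is a simplified illustration of the creeping random idea on the unit-sphere parameterization, not Wolff's program. The names, the step count, the two-stage variance schedule, and the convention that the augmenting unit component comes first in each observation are assumptions.

```python
import math
import random

def weights_from_angles(theta):
    """Map D angles to a point on the unit sphere with D + 1 coordinates."""
    w, prod = [], 1.0
    for t in theta:
        w.append(prod * math.cos(t))
        prod *= math.sin(t)
    w.append(prod)
    return w

def error_rate(theta, samples):
    """samples: list of (label, v), label +1 or -1, v augmented with a 1."""
    w = weights_from_angles(theta)
    wrong = 0
    for label, v in samples:
        side = sum(wi * vi for wi, vi in zip(w, v))
        if side * label <= 0:
            wrong += 1
    return wrong / len(samples)

def creeping_random_search(samples, theta0, sigma=0.5, steps=400, seed=0):
    """Accept a random displacement only if it lowers the error rate."""
    rng = random.Random(seed)
    best, best_err = list(theta0), error_rate(theta0, samples)
    for step in range(steps):
        # Hold the spread constant early, then reduce it in the final stage.
        s = sigma if step < steps // 2 else 0.25 * sigma
        trial = [t + rng.gauss(0.0, s) for t in best]
        err = error_rate(trial, samples)
        if err < best_err:
            best, best_err = trial, err
    return best, best_err

if __name__ == "__main__":
    samples = [(1, (1.0, 2.0, 2.0)), (1, (1.0, 3.0, 1.0)),
               (-1, (1.0, -2.0, -1.0)), (-1, (1.0, -1.0, -3.0))]
    print(creeping_random_search(samples, [1.0, 4.0]))
```

Because rejected trials restart from the last accepted position, the recorded error rate can never increase during the search.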

IV.  Other  Intuitive  Criteria 

Many  ad  hoc  optimization  criteria  have  been  proposed  for  the 
solution  of  pattern  recognition  problems.  Most  of  these  criteria  are 
intuitively  appealing  and  offer  a  so-called  "optimum"  solution  without 
the  use  of  the  class  probability  densities. 

Consider  a  criterion  which  was  proposed  by  R.  A.  Fisher  (1925). 
Suppose  there  are  two  classes  in  D-dimensional  space.  Suppose  it  is 
desirable  to  project  these  classes  onto  a  line  so  that  the  "distance" 
between  the  classes  is  as  large  as  possible.  A  threshold  can  be  set  along 
this  line  and  classification  of  new  observations  begun. 

To make the "distance" between the classes large, Fisher maximizes
the scatter between the classes while keeping the scatter among
observations of the same class constant. Let $W$ be the linear transformation
to do this. The problem then is to maximize

$$\sum_{V_i \in \text{class 1},\ V_j \in \text{class 2}} (W^T V_i - W^T V_j)^2$$

while constraining the following expression to be equal to a constant:

$$\sum_{V_i \in \text{class 1},\ V_j \in \text{class 1}} (W^T V_i - W^T V_j)^2 + \sum_{V_i \in \text{class 2},\ V_j \in \text{class 2}} (W^T V_i - W^T V_j)^2 = C.$$


The solution $W$ is the eigenvector of the largest eigenvalue of

$$(B A^{-1} - \lambda I) = 0,$$

where

$$A = \sum_{V_i \in \text{class 1}} (V_i - m_1)(V_i - m_1)^T + \sum_{V_j \in \text{class 2}} (V_j - m_2)(V_j - m_2)^T$$

is the intraset scatter matrix, and $B$ is the interset scatter matrix, a sum over all classes whose summand is not recoverable from the scan. $m_1$ and $m_2$ are the sample means of class 1 and class 2, respectively.
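For two classes the projection direction is often computed without solving the eigenvalue problem, using the standard closed form $W \propto A^{-1}(m_1 - m_2)$. That closed form is brought in here as an assumption and is not stated in the source. The sketch below applies it in two dimensions, with hypothetical names and a threshold placed midway between the projected class means.

```python
def mean2(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(2)]

def scatter2(vectors, m):
    """Sum of (v - m)(v - m)^T over the observations of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - m[0], v[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class1, class2):
    """Return (w, m1, m2) with w = A^-1 (m1 - m2), A the intraset scatter."""
    m1, m2 = mean2(class1), mean2(class2)
    s1, s2 = scatter2(class1, m1), scatter2(class2, m2)
    a = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [(a[1][1] * d[0] - a[0][1] * d[1]) / det,
         (-a[1][0] * d[0] + a[0][0] * d[1]) / det]
    return w, m1, m2

def classify(v, w, m1, m2):
    """Threshold midway between the projected means; class 1 projects higher."""
    dot = lambda x, y: x[0] * y[0] + x[1] * y[1]
    threshold = 0.5 * (dot(w, m1) + dot(w, m2))
    return 1 if dot(w, v) > threshold else 2

if __name__ == "__main__":
    c1 = [(0.0, -2.0), (0.0, 2.0), (0.5, 0.0), (-0.5, 0.0)]
    c2 = [(3.0, -2.0), (3.0, 2.0), (3.5, 0.0), (2.5, 0.0)]
    w, m1, m2 = fisher_direction(c1, c2)
    print(w, classify((0.2, 1.5), w, m1, m2))
```

In the usage example both classes are elongated along the second coordinate, so the direction found lies along the first coordinate, where the classes actually differ.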

Sebestyen (1961), (1962a) advocates maximizing the scatter between
classes (interset distance) while keeping the total scatter constant (sum
of interset and intraset distances). As expected, this method also yields
an eigenvalue problem $(B A^{-1} - \lambda I) = 0$, where $A$ and $B$ are now given by

$$A = \sum_{\text{all classes}} (V_i - V_j)(V_i - V_j)^T$$

and

$$B = \sum_{V_i \in \text{class 1},\ V_j \in \text{class 2}} (V_i - V_j)(V_i - V_j)^T.$$

One problem with these methods is that the inverse of a large matrix has
to be computed. If one decides to minimize the total scatter while constraining
$W$ (e.g. $\sum_i W_i^2 = 1$), $A$ disappears from the equation.
However, this approach does not yield a suitable solution to such a simple
problem as shown in Figure D-3. The ellipses labeled (1) and (2) represent
the contours of constant probability for the classes. A line which
suitably discriminates between the two classes is labeled as the "ideal
discriminant". This line is contrasted with the line which minimizes the
total scatter.

Sebestyen  (1962a)  offers  a  nonlinear  approach  to  discriminating 

between  patterns.  He  approximates  a  generalized  discriminant  with  a 

polynomial  function  of  the  coordinates  of  the  measurement  space.  A 

search  is  performed  over  polynomials  of  various  order  starting  with  the 

first  order  polynomial  and  continuing  with  higher  order  polynomials  until 

a  suitable  categorizer  is  found.  However,  for  a  high  dimensional  space 

and a high order polynomial, this approach can be very time consuming.

Widrow and Hoff (1960) introduce a performance criterion which
states that in a two class problem the distance within the classes should
be minimized about two fixed points. They devise an iterative procedure
for the linear separation of binary patterns. Patterson and Womack (1966)
use this criterion and a nonlinear discriminant function. They assume
a discriminant function of the form $u(W, V) = \sum_{i=1}^{n} W_i \phi_i(V)$, where the
$\phi_i(V)$ are given. They train the machine to approximate a discriminant
function which maps class 1 observations to one fixed point and class 2
observations to another. A search technique is used to minimize the mean
square deviation of $u(W, V)$ from these points. That is, they minimize

[The minimized expression is not recoverable from the scan.]

where $\langle \cdot \rangle_{\text{class } i}$ denotes the average over the class $i$ observations.

Figure D-3. Discriminant Functions.

Figure D-4. Elementary Decision Rules.

There  are  many  simple  decision  rules  which  are  based  upon 
regression,  Euclidean  distance,  or  correlation.  For  example,  a  mean 
square  regression  line  (curve,  surface,  etc.)  can  be  determined  for  each 
class.  One  then  decides  that  a  new  observation  V  is  a  member  of  a 
certain  class  if  it  is  closer  to  the  regression  line  of  that  class  than  to  the 
regression  line  of  any  other  class. 

Another  simple  decision  rule  is  based  upon  the  Euclidean  distance 
from  V  to  a  characteristic  point  of  the  classes  (e.g.  the  class  sample 
means).  The  rule  decides  that  V  belongs  to  a  certain  class  if  V  is 
closer  to  the  characteristic  point  of  that  class  than  to  the  corresponding 
characteristic  point  of  any  other  class. 

In still another rule it is decided that $V$ belongs to class $i$ if its
dot product (i.e. correlation) with the sample mean of that class is
larger than its dot product with the sample mean of any other class.
Figure D-4 illustrates these procedures based on the sample means $m_1$
and $m_2$ of class 1 and class 2, respectively. One can see that these
simple schemes only work well for certain class configurations.
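The distance and correlation rules are easy to compare side by side. The following sketch uses hypothetical function names and a made-up pair of class means; it is meant only to show that the two rules can disagree on the same observation.

```python
import math

def nearest_mean_class(v, means):
    """Assign v to the class whose sample mean is closest (Euclidean)."""
    return min(means, key=lambda c: math.dist(v, means[c]))

def max_correlation_class(v, means):
    """Assign v to the class whose sample mean has the largest dot product with v."""
    return max(means, key=lambda c: sum(a * b for a, b in zip(v, means[c])))

if __name__ == "__main__":
    means = {1: (1.0, 0.0), 2: (4.0, 0.0)}
    v = (2.0, 0.0)
    print(nearest_mean_class(v, means))
    print(max_correlation_class(v, means))
```

Here the observation is nearer to the class 1 mean, yet its dot product with the class 2 mean is larger because that mean lies farther from the origin.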

Fix and Hodges (1951) propose that an unknown observation $V$ be
classified as a member of the class of its nearest neighbor (as described
by an arbitrary metric). This is the famous nearest-neighbor (NN) rule.
A more complicated rule, the $k_n$ nearest-neighbor ($k_n$-NN) rule,
assigns an unclassified point to the class most heavily represented among
its $k_n$ nearest neighbors. Fix and Hodges establish the consistency of
the $k_n$-NN rule for sequences $k_n \to \infty$ such that $k_n/n \to 0$. Also, Fix
and Hodges (1952) present a numerical investigation of the small sample
performance of the NN rule and the 3-NN ($k_n = 3$) rule under the assumption
of normal statistics.
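A minimal sketch of the $k_n$-NN rule with Euclidean distance is given below. The function name and the training data are illustrative assumptions, and ties in the vote are broken by whichever tied label appears first among the nearest neighbors.

```python
import math
from collections import Counter

def knn_classify(v, training, k=1):
    """training: list of (label, vector). Majority vote among the k nearest."""
    neighbors = sorted(training, key=lambda item: math.dist(v, item[1]))[:k]
    votes = Counter(label for label, _ in neighbors)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    training = [("F", (0.0, 0.0)), ("F", (1.0, 0.0)), ("F", (0.0, 1.0)),
                ("G", (5.0, 5.0)), ("G", (6.0, 5.0)), ("G", (0.4, 0.4))]
    v = (0.3, 0.3)
    print(knn_classify(v, training, k=1))
    print(knn_classify(v, training, k=3))
```

The stray class G observation at (0.4, 0.4) captures the NN decision for nearby points, while the 3-NN vote is decided by the surrounding class F observations.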

Cover and Hart (1967) show that the probability of error for the
nearest-neighbor rule is less than twice the Bayes probability of error,
based on an infinite sample. They further demonstrate that the NN rule
is admissible among the $k_n$-NN rules for certain classes of
distributions. These are the classes of distributions for which the
distance between any two elements of the same class is less than the
distance between any two elements of different classes. This, of course,
rules out any possibility of overlap between the classes. Consider a
demonstration of the admissibility of the NN rule among the $k_n$-NN
rules. Suppose one class is uniformly distributed over the interval
[-2, -1] and the other class is uniformly distributed over the interval
[1, 2], both on the real line. Suppose that the a priori probability of
occurrence of each class is equal to 1/2. Let n training observations be
taken. Suppose that a new observation, which is to be classified, falls
in the interval [1, 2]. To make an error by the NN rule all of the n
training observations must fall in [-2, -1]. The probability of this
occurring is $(1/2)^n$. To make an error by the $k_n$-NN rule, where
$k_n$ is odd, no more than $(k_n - 1)/2$ of the n training observations
may fall in [1, 2], so that observations from [-2, -1] form the majority
among the $k_n$ nearest neighbors. The probability of this occurring is

$$\left(\frac{1}{2}\right)^n \sum_{j=0}^{(k_n-1)/2} \binom{n}{j}$$

This is, of course, greater than $(1/2)^n$ whenever $k_n > 1$. Thus
admissibility is proved for this set of distributions.
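
The two error probabilities in this example can be evaluated directly. The sketch below assumes that the k-NN rule errs exactly when at most (k - 1)/2 of the n training observations fall in the interval containing the new observation, which reduces to the NN case for k = 1. The function names are hypothetical.

```python
from math import comb


def nn_error_probability(n):
    """Probability that all n training observations fall in the wrong
    interval, each interval having prior probability 1/2."""
    return 0.5 ** n


def knn_error_probability(n, k):
    """Probability that at most (k - 1)/2 of the n training observations
    fall in the correct interval (k odd, k <= n)."""
    if k % 2 == 0 or k > n:
        raise ValueError("k must be odd and no larger than n")
    return 0.5 ** n * sum(comb(n, j) for j in range((k - 1) // 2 + 1))


n = 10
p_nn = nn_error_probability(n)
p_3nn = knn_error_probability(n, 3)
```

For n = 10 the 3-NN error probability is eleven times that of the NN rule, in line with the admissibility argument above.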

The nearest-neighbor rule is sensitive to spurious information
since an annulus is formed about an observation from one class which
falls in a region surrounded by observations from different classes.
Sebestyen (1962c) mentions two rules which eliminate or greatly reduce
the possibility that an annulus forms around spurious observations.
These rules have the same effect as the $k_n$-NN rule in that more than
one nearest neighbor is considered. Let $d(V, f_m)$ be some distance
measure from point V to the m-th element of class F. The first rule
decides that V is a member of class F if

$$\sum_{m=1}^{M_F} \frac{1}{d^k(V, f_m)} > \sum_{s=1}^{M_G} \frac{1}{d^k(V, g_s)}$$

where $M_F$ is the number of elements in class F and k is an arbitrary
number chosen to determine the neighborhood of V which is to influence
the decision. The second rule decides in favor of class F if

$$\sum_{m=1}^{M_F} \frac{1}{1 + \left(\frac{d(V, f_m)}{r}\right)^k} > \sum_{s=1}^{M_G} \frac{1}{1 + \left(\frac{d(V, g_s)}{r}\right)^k}$$

Note that for Euclidean distance and for a large k, this rule essentially
counts the number of elements from class F contained in a radius r
from V and compares this with the number of elements from class G in
that same radius.
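
As a rough illustration of the two rules, the following sketch evaluates both sums for observations on the real line. The function names, the sample values, and the use of the absolute difference as the distance measure are assumptions made for the example; the first score is undefined when V coincides with a training observation.

```python
def inverse_distance_score(v, members, k):
    """Sum of 1 / d^k over the members of one class (first rule)."""
    return sum(1.0 / abs(v - f) ** k for f in members)


def soft_count_score(v, members, k, r):
    """Sum of 1 / (1 + (d / r)^k) over the members of one class
    (second rule). For large k this approaches the number of members
    lying within a distance r of v."""
    return sum(1.0 / (1.0 + (abs(v - f) / r) ** k) for f in members)


def decide(score_f, score_g):
    """Decide in favor of class F when its score is the larger one."""
    return "F" if score_f > score_g else "G"


# Hypothetical data: 0.38 is a stray member of class G lying among the
# members of class F, and it is the nearest neighbor of v.
v = 0.3
class_f = [0.0, 0.2, 0.5]
class_g = [0.38, 2.0, 2.5]

rule_one = decide(inverse_distance_score(v, class_f, k=1),
                  inverse_distance_score(v, class_g, k=1))
rule_two = decide(soft_count_score(v, class_f, k=8, r=0.5),
                  soft_count_score(v, class_g, k=8, r=0.5))
```

In this example the plain NN rule would choose class G, whereas both sums favor class F. With k = 8 and r = 0.5 the second rule's scores are close to the counts 3 and 1.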

As has been seen, many classification procedures are available
for use in pattern recognition. Since these decision procedures are
difficult to compare, some ambiguity is involved in the choice of a
decision procedure. The procedure to use depends on the amount of
information available, the desired complexity of the decision procedure,
and personal preference.


ABSTRACT


Signals  reflected  from  irregular  time  varying  boundaries  such  as 
the  sea  surface  undergo  distortion  which  limits  their  detectability 
and  useability  for  tracking.  The  properties  of  this  distortion  for 
correlator  processing  are  herein  related  to  the  statistical  constraints 
placed  upon  the  time  variation  and  irregularity  of  the  boundary.  Two 
propagation  geometries  are  analysed.  The  first  deals  with  the  cross¬ 
correlation  of  surface  reflected  and  direct  transmission  paths,  and 
the  second  with  the  cross-correlation  of  surface  scattered  signals 
received  at  two  different  locations.  This  analysis  assumes  that  the 
signal  generated  at  the  target  and  the  background  noise  are  both 
gaussian  random  variables.  Three  models  of  the  scattering  mechanism 
are  proposed  and  two  are  analysed  in  detail.  In  all  cases  the 
correlator  output  is  shown  to  exhibit  very  persistent  fluctuations 
due  to  the  scattering.  The  existence  of  these  fluctuations  is  related 
to  the  non-gaussian  nature  of  the  scattered  signals.  The  fourth  order 
cumulant  is  computed  to  show  that  well  spaced  scattered  signal 
samples  may  be  dependent  even  when  they  are  uncorrelated.  Results 
are presented for low-pass signal spectra and are investigated as a
function  of  bandwidth.  When  the  receiver  is  constrained  to  be  steered 
"on  target"  only  the  signal  energy  that  is  coherent  between  various 
paths  contains  information  useful  for  detection  or  tracking.  However, 
when  the  receiver  is  not  so  constrained,  signal  scattering  of  a  delay 
modulated  nature  is  shown  to  be  useful  for  detection. 


Table of Contents


ABSTRACT

CHAPTER I, INTRODUCTION

1.0  Preliminary Remarks
1.1  Description of the General Problem and Preview of the Results
1.2  A Brief Historical Summary of Surface Scattering and the
     Motivation for the Current Research
1.3  Scattering from Time Varying Surfaces, The Extended Kirchhoff
     Integral Equation
1.4  The Approximate First Order Solution of the Kirchhoff Integral
     Equation

CHAPTER II, PRELIMINARY DISCUSSION OF SIGNAL AND SYSTEM PROPERTIES

2.0  Introduction
2.1  Concerning the Nature of the Target Signal
2.2  System Function Description for Linear Time Varying Filters
2.3  Cascaded Linear Time Varying Systems
2.4  Stationary Stochastic System Correlations
2.5  Time Invariant and Slowly Varying Systems

CHAPTER III, STATISTICS OF THE FINITE-TIME CORRELATOR

3.0  Introduction
3.1  Cross-Correlation of Direct and Surface Reflected Paths
3.2  Second Order Statistics for the Output of the Correlator
3.3  Correlator Fluctuation for the Direct and Surface Reflected
     Multipath Processor
3.4  Two Receiver Array Cross-Correlator Processing
3.5  Cross-Correlator Fluctuation for the Two Receiver Array
3.6  Correlator Tracking Error
3.7  Concerning the Departure of the Statistics of the Scattered
     Signals from Gaussian
3.8  The Two Sided Likelihood Decision Scheme for the Correlator
     Detector

CHAPTER IV

4.0  Introduction
4.1  First and Second Order Statistics for the Random Amplitude and
     Delay Model
4.2  Multipath Correlator Fluctuations for the Random Amplitude and
     Delay Model
4.3  Second and Fourth Order Cross System Statistics for the Random
     Amplitude and Delay Model
4.4  Array Correlator Fluctuations for the Two Channel Random
     Amplitude and Delay Model

CHAPTER V, THE RANDOMIZED SINUSOIDAL SURFACE MODEL

5.0  Introduction
5.1  Gulin's Solution for H(w,t) and the Associated Impulse Response
5.2  First and Second Order Statistics of the Sinusoidal Boundary
     Model
5.3  Multipath Correlator Fluctuations for the Random Sinusoidal
     Boundary
5.4  Second and Fourth Order Cross System Statistics for the Two
     Channel Random Sinusoidal Boundary

CHAPTER VI, POSSIBLE EXTENSIONS OF THE PRESENT WORK AND SUGGESTIONS
FOR FUTURE RESEARCH

6.0  Extensions to the Fully Random Boundary
6.1  Other Suggested Topics for Future Research

APPENDIX A, MORGAN'S DERIVATION OF EQUATION (1.3-14)
APPENDIX B, PLANE WAVE EXPANSION FOR SURFACE SCATTERING
APPENDIX C, SPECULAR POINT EXPANSIONS FOR r AND r'
APPENDIX D, REFINEMENTS IN THE CRITERIA FOR DETECTABILITY
APPENDIX E, VARIANCE INTEGRALS FOR THE RANDOM AMPLITUDE AND DELAY MODEL
APPENDIX G, TABULATION OF THE FIRST FEW G(p,q,T)
APPENDIX H, VARIANCE INTEGRALS FOR THE RANDOM SINUSOIDAL BOUNDARY MODEL
APPENDIX I, NUMERICAL EVALUATION OF THE G**(z)
APPENDIX J, EXTENSIONS TO LARGE ARRAYS


CHAPTER  I 


INTRODUCTION 

1.0  Preliminary  Remarks 

Two  important  techniques  that  have  been  used  to  improve  the 
performance  of  Sonar  detection  and  communication  systems  are  multipath 
and  space  diversity  signal  processing.  Multipath  signal  processing 
capitalizes  on  the  signal  replication  or  echoing  that  characterizes 
propagation  from  source  to  receiver  along  many  paths.  Similarly,  space 
diversity  processing  takes  advantage  of  the  signal  replication  which 
occurs  at  an  array  of  spatially  separated  receivers  when  transmission 
is  from  a  common  source. 

Multipath  Sonar  processing  is  generally  difficult  to  implement  in 
practice  since  it  usually  requires  a  detailed  knowledge  of  the 
propagation  geometry.  When  this  information  is  not  available  or  when 
it  is  difficult  to  estimate,  multipath  effects  are  more  often  regarded 
as  a  hindrance  than  as  an  aid.  However,  multipath  propagation  has 
been studied with great interest for range and depth estimation in
tracking. Furthermore, in certain receivers concerned with the
detection of signals of unknown spectrum, multipath replication is the
sole distinguishing feature which can be used to discriminate between
targets and noise.¹

Space diversity processing on the other hand is more commonly
exploited and more thoroughly understood. Here the design of signal
processors is not as dependent on knowledge of specific propagation
geometry and it is therefore less sensitive, more flexible, and easier
to implement. Moreover, such signal processing permits discrimination
against non-directional background noise. This noise rejection can
frequently be improved by simply increasing the number of receiving
elements.²

Clearly,  propagation  geometries  exist  which  include  both  multipath 
and  space  diversity  replication.  Figure  1.0-1  illustrates  some  typical 
propagation  geometries.  Case  (c)  falls  into  this  mixed  space  and 
multipath  category.  The  receivers  in  these  examples  are  assumed  to  be 
single-site  sensors  located  at  various  points  in  space.  Each  sensor  is 
assumed  to  be  directional  enough  to  select  either  by  design  or  by 
accident  certain  ray  paths  for  reception.  Case  (b)  is  an  example 
which  illustrates  two  single-site  receivers  which  suppress  the  direct 
paths  of  transmission. 

When the reflections from the boundary do not change the incident
signal (except possibly for sign) then the replication is termed
perfect. This occurs when sound reflects from a completely smooth
air-water interface when the sound is incident from within the water
medium. In this case the reflection is locally characterized by a
pressure-release boundary condition.³

However,  when  the  boundary  is  deformed  spatially  in  some  random 
fashion  then  the  signal  replication  becomes  distorted.  The  spatial 
deformations  produce  frequency  dependent  interference  effects.  At 
certain  frequencies  and  locations  these  interference  effects 
superpose  constructively  to  enhance  the  strength  of  transmission.  At 
other  frequencies  or  locations,  however,  the  interference  is  found  to 
be destructive, producing poor transmission.

Figure 1.0-1 Typical Propagation Geometries Involving Surface Scattering:
(a) Single Multipath Receiver Using Direct and Surface Reflected Paths;
(b) Two Element Vertical Array Using Only Surface Reflected Paths;
(c) Two Element Horizontal Array Using Direct and Surface Reflected Paths;
(d) Multi-receiver Array Using Only Surface Paths.


Constructive interference occurs at any given frequency when
transmissions over various paths reflecting from randomly oriented
facets of the irregular boundary arrive at the receiver in phase with
each other. For far-field reception this results from spatial
periodicities in the boundary deformations which give rise to effective
path length differences which are multiples of the wavelength $\lambda$
of the radiation.⁴ Similarly, the degree of power loss during destructive
interference is determined by the likelihood that the boundary facets
position and align themselves in such a way as to consistently divert
energy away from the receiver. From the point of view of multiple
transmission paths, effective path length differences produce signal
cancellation. The properties of the scattered radiation are thus seen
to be linked to the statistical properties of the boundary deformations.
In particular, the two dimensional space spectrum describing the
harmonic content of the surface irregularities plays an important role.⁵

In connection with this purely spatial redistribution of reflected
energy, the interference due to surface deformations introduces frequency
selective transmission properties. These produce spectral alterations
in the scattered signal which invariably cause correlation degradation
and consequently poorer signal detectability. In addition, the
spatial extent of the active scattering area produces a general
spreading or smearing of signal correlation over arrival time.⁶

The irregular surfaces considered here are assumed to be
instantaneous realizations of a stochastic ensemble of such surfaces.
Each surface ensemble member generally produces a different scattered
signal. If the boundary deformations change in shape as a function
of time then the scattered signal exhibits fluctuations due to this
motion. For severe scattering this fluctuation can turn into deep
fading. When the time variations are very slow, signal processors
operating on the scattered signals are occasionally confronted with
low signal to noise ratios for long periods of time.

If the time variation of the surface is assumed to satisfy the
two dimensional wave equation then the periodic or harmonic components
of the surface propagate at constant velocities in various directions.
If the incident radiation is monochromatic these surface motions
generate side bands near the frequency of the incident radiation which
are easily observed.⁷ In this case a discrete harmonic space
component of the surface with temporal period of $2\pi/\Omega$ which
produces constructive interference due to transmission path length
differences of $n\lambda$ generates signal sidebands at

$$\omega = \frac{2\pi c}{\lambda} \pm n\Omega$$  (1.0-1)

where c is the signal propagation velocity. Similarly, surfaces with
diffuse space components produce a general smearing of the signal
spectrum.

In agreement with the generally accepted definition,⁸ the
constructive superposition at a certain location and frequency due to
transmission with path length differences of $n\lambda$ is termed a
spectral order. We call n the order of interference. It is clear that
the number and strength of orders that are "seen" by the receiver depend
on the directivity and orientation of the sensors. With highly
directive sensors it is possible to observe single orders.⁹


At higher acoustic frequencies the number and spatial density of
these orders increases. Consequently, the degree of frequency smear or
shifting which can occur in accordance with Equation (1.0-1) becomes
larger. Furthermore, the high frequency components of the incident
signal which constitute the fine structure of the signal correlation
are especially susceptible to temporal smearing. Hence, it is desirable
to have a simple criterion by which one can judge whether significant
time or frequency smear occurs during reflection. The most universally
accepted criterion is that of Rayleigh.¹⁰ It states that if the angle
of grazing for incident radiation is $\theta_g$ and if the maximum
height of the surface irregularities is h then the surface appears
smooth if

$$k h \sin\theta_g \ll 1$$  (1.0-2)

where $k = 2\pi/\lambda$. Although many theoretical efforts have been
made to refine this crude rule-of-thumb, it remains one of the most
satisfactory criteria available, as will be shown in chapter 5.
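
As a numerical illustration of the Rayleigh criterion, the sketch below evaluates the product of wave number, maximum surface height, and the sine of the grazing angle. The function name, the nominal sound speed of 1500 m/s, and the two sets of sample values are assumptions chosen for the example; no particular threshold for "much less than one" is implied.

```python
import math


def rayleigh_parameter(frequency_hz, max_height_m, grazing_angle_deg,
                       sound_speed_m_s=1500.0):
    """Return k * h * sin(grazing angle), with k = 2*pi / wavelength.

    The default sound speed is an assumed nominal value for sea water.
    """
    wavelength = sound_speed_m_s / frequency_hz
    k = 2.0 * math.pi / wavelength
    return k * max_height_m * math.sin(math.radians(grazing_angle_deg))


# Hypothetical cases: a low frequency over small ripples at a shallow
# grazing angle, and a high frequency over one-meter waves at 30 degrees.
low_frequency_case = rayleigh_parameter(100.0, 0.1, 10.0)
high_frequency_case = rayleigh_parameter(5000.0, 1.0, 30.0)
```

The first case gives a value well below one, so the surface would appear smooth; the second gives a value of about ten, so it would not.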


1.1  Description of the General Problem and Preview of the Results

This report deals with the passive sonar detection of targets
generating zero mean Gaussian noise-like signals. The detector is
assumed to use echoes or replicas of the signals which are distorted by
scattering from an irregular time-varying pressure release surface.

The target is assumed to radiate in an omni-directional manner into a
uniform, isovelocity medium which is characterized by rectilinear or
straight line ray propagation.¹¹ The radiated signals are presumed to
travel along a small number of paths to one or possibly two receivers
with some of these paths reflecting from the surface. The received
signals are also assumed to be corrupted by broad-band, additive
Gaussian background noise.

Attention is primarily focused on evaluating the effect of the
slowly varying irregular surface on the performance of correlator
detectors and trackers. Scatter degradation is computed as a function
of two principal design parameters:

1.  Correlator integration time.

2.  Signal processing bandwidth.

The  technique  used  to  evaluate  performance  is  to  compute  the  mean  and 
variance  of  the  correlator  output  under  the  two  hypotheses  of  target 
present  and  target  absent.  The  general  nature  of  the  propagation 
geometry  is  considered  known. 
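
The evaluation technique can be illustrated with a deliberately simplified Monte Carlo sketch. It assumes two channels that share a common zero mean Gaussian signal when the target is present, independent white Gaussian noise in each channel, and a correlator that averages the product of the two channels over a fixed number of samples. Surface scattering is not modeled, and all function names and parameter values are hypothetical.

```python
import random
import statistics


def correlator_output(rng, n_samples, signal_std, noise_std, target_present):
    """Average of the sample-by-sample product of two noisy channels."""
    total = 0.0
    for _ in range(n_samples):
        s = rng.gauss(0.0, signal_std) if target_present else 0.0
        x = s + rng.gauss(0.0, noise_std)  # first channel
        y = s + rng.gauss(0.0, noise_std)  # second channel
        total += x * y
    return total / n_samples


def output_statistics(n_trials, n_samples, signal_std, noise_std,
                      target_present, seed=0):
    """Estimate the mean and variance of the correlator output."""
    rng = random.Random(seed)
    outputs = [correlator_output(rng, n_samples, signal_std, noise_std,
                                 target_present)
               for _ in range(n_trials)]
    return statistics.mean(outputs), statistics.pvariance(outputs)


mean_present, var_present = output_statistics(1000, 100, 1.0, 1.0, True)
mean_absent, var_absent = output_statistics(1000, 100, 1.0, 1.0, False)
```

Under these assumptions the output mean is close to the signal power when the target is present and close to zero when it is absent, and the variance in both cases shrinks as the number of averaged samples grows.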

It is shown that slow time variations in the irregularities of the
surface manifest themselves as persistent fluctuations in the output of
correlators operating on the scattered signals. These fluctuations are
noticeable until the correlator integration interval becomes much
larger than the relaxation times that describe the temporal behavior
of the surface. Only at this point does the correlator reliably
"average out" these variations over the ensemble of possible surface
deformations. Similar fluctuations are also exhibited in estimates of
such parameters as target bearing, range or depth which are derived
from the correlator output.

The  most  distinguishing  feature  of  the  treatment  presented  in 
this  report  is  the  explicit  inclusion  of  the  time-variation  of 
scattering.  A  comprehensive  description  of  the  scatter  dynamics  or 
relaxation  mechanisms  becomes  necessary  when  the  scattering  is  strong 
enough to force long processing intervals in pursuit of reliability.
A distinction is drawn between long and short term correlator
fluctuations.

Of  course,  when  the  input  signal  to  noise  ratio  is  high  and  the 
scattering  weak,  short  integration  times  may  be  used.  The  correlator 
fluctuations  are  still  apparent,  but  the  physics  of  the  scattering  may 
now be considered "frozen" for the duration of the processing interval.
The time variation of the scattering then becomes important only
insofar as it describes the average lengths of time during which the
signal  correlation  "fades"  and  remains  uniformly  weak,  the  smear  due 
to  drift  being  negligible.  The  "frozen"  model  is,  however,  just  a 
special  case  of  the  dynamic  or  time-varying  model. 

With  regard  to  the  signal  processing  bandwidth  it  has  been  noted 
that  higher  frequency  signals  tend  to  be  more  severely  decorrelated 
by scattering. Furthermore, signal to noise ratio tends to decrease
with increasing frequency. Therefore, it is desirable to be able to
set  the  working  bandwidth  at  a  value  which  allows  enough  usable  energy 
to  be  processed  while  rejecting  excess  background  noise  and  unusable, 
badly  decorrelated  signal.  This  value  generally  depends  on  the 
statistical  roughness  of  the  surface  irregularities. 

The  exact  definition  of  the  optimal  processing  bandwidth  is  more 
complex  for  array  processing.  It  is  interrelated  with  the  question  of 
the  placement  of  sensors  and  interpath  coherence.  For  if  two  sensors 
are  placed  in  close  proximity,  the  scattered  energy  received  by  them 
tends to come from nearly identical or overlapping scatter facets.

Even  heavily  scattered  signals  can  contribute  to  sensor  cross-correlation 
under  such  circumstances. 

On  the  other  hand,  it  is  a  well  known  fact  that  background  noise 
cross-correlation at widely separated sensors tends to be small. This
effect  is  often  used  to  obtain  increased  array  gain  while  maintaining 
simplicity  of  processor  design.  It  is  important  to  remember,  however, 
that  scattered  signal  correlation  also  drops  off  with  increasing 
separation.  For  very  large  separations  the  sensors  receive  signals 
scattered  by  completely  independently  positioned  and  oriented  facets. 

Three basic propagation geometries are investigated:

1.  Single-site reception of direct path and surface
reflected path (Figure 1.0-1a). In this case
the signals transmitted over the two paths are
assumed to be separable through use of directional
sensors.

2.  Two site reception of surface reflected paths
(Figure 1.0-1b). The direct paths are assumed to
be suppressed in order to prevent the analysis
from becoming unnecessarily complicated.

3.  N-site reception of surface-reflected path (Figure
1.0-1d). Again the direct paths are suppressed.

The first case is of interest in range and depth estimation while case
(2) is important in bearing estimation. The third case is
examined only in a very brief manner in Appendix J.

Three different models are examined for surface scattering:

1.  Random time-variable delay and amplitude modulation.
This model is not particularly realistic, but it is
suitable for determining the effect of the time-variation
of the scattering. (Chapter 4)

2.  Randomized one-dimensional surface corrugation
scattering. This model is slightly more realistic
than model (1) while remaining reasonably tractable.
This surface is perfectly correlated along the direction
parallel to the corrugation. Randomization of the
model is handled by stochastic parameters. (Chapter 5)

3.  Two dimensionally irregular surface scattering. In
this case the boundary deformations are described
statistically by a two dimensionally stationary
spatial correlation function. (Chapter 6)

Although the simpler models (1) and (2) are not entirely satisfactory
representatives of real physical scattering, the results derived for
them do share certain overall similarities with the fully stochastic
model of (3). Moreover, it is possible to obtain some results in closed
form for the simpler models which are at the time of this writing
unattainable for the more realistic case. Model (3) is discussed only
briefly.

The analyses of models (2) and (3) are approximate. The
approximations used, however, enjoy a fair degree of popular
acceptance in the literature. To some extent there is experimental
justification¹² for this optimism, but the agreement rapidly becomes
only qualitative when the restrictions on the approximations are
violated. Formally, the results of this report apply only for weak
to moderate scattering observed in the far-field and at angles of
grazing large enough so that over-shadowing and multiple scattering
do not occur. This also implies that most of the scattering comes
from the vicinity of the surface near the specular point. It is also
assumed that the surface displacements and slopes are small and the
radii of curvature are large compared to $\lambda$. This in turn implies
that the surface roughness is very slight and that the incident
radiation is of low to moderate frequency.

Although only low-pass filtering is examined in the applications
actually analysed in chapters 4 and 5, chapter 3 is sufficiently
general to handle narrowband problems. Applications of the formulas
in chapter 3 to the very narrow bandwidth case must, however, be
prefaced by the comments in chapter 2. In this limit frequency
smearing must be properly taken into account.

Additional material is presented in chapter 2 which is important
in applying the general results of chapter 3 to multiple bounce scatter
propagation. In this case the frequency spreading due to a given
reflection interacts with that of successive reflections.

1.2  A Brief Historical Summary of Surface Scattering and the Motivation for the Current Research

It is not the purpose of this report to present a comprehensive
summary of the development of the theory of scattering from irregular
surfaces. Beckmann and Spizzichino¹³ have provided a very complete
bibliography covering publications up to 1963. The article by Lysanov¹⁴
compiles a listing of the Russian effort up to 1958. Many of the more
recent articles on scattering are refinements of these earlier
treatments. We shall confine our attention here to a discussion of some
of the simpler concepts which are building blocks leading to the point
of view developed in this report.

The first attempts to analyze rough surface scattering were preceded
by a long period of study of diffraction and interference effects by
Fresnel¹⁵ and Fraunhofer.¹⁶ The analysis of the Fraunhofer parallel slit
problem demonstrated the relation between spatial periodic components
describing apertures in a screen and the position of interference maxima
for waves passing through the apertures.

The analysis by Rayleigh¹⁷ for the reflection of sound from a
diffraction grating appears to be the earliest attack on the problem of
scattering from surface deformations. This analysis assumes an incident
radiation which is in the form of a monochromatic plane wave. The
grating is assumed to be in the form of a periodic corrugation $\zeta$,
with, for example, the direction of the corrugation along the x-axis:

$$\zeta(x) = \zeta(x + \Lambda)$$  (1.2-1)

where $\Lambda$ is the spatial period of the corrugation. The incoming
radiation is assumed incident along the x-axis at an angle $\psi_i$. The
receiver is placed in the infinitely removed far-field observing the
reradiation along the x-axis at angle $\psi_r$. The path length
difference for two rays connecting the source and receiver which impinge
on the corrugation at similar points separated by $\Lambda$ (see
Figure 1.2-1) is given by

$$BC - AD = \Lambda (\cos\psi_r - \cos\psi_i)$$  (1.2-2)

Figure 1.2-1 The Diffraction Grating Problem

For an interference maximum of reradiation to occur at an angle of
$\psi_{rm}$ this path length difference should be a multiple of the
wavelength $\lambda$ so that

$$\cos\psi_{rm} = \cos\psi_i + m\lambda/\Lambda$$  (1.2-3)

This relation is called the grating equation and the angles $\psi_{rm}$
(for $m = 0, \pm 1, \pm 2, \ldots$) are termed Bragg angles.¹⁸ From the
requirement that $|\cos\psi_{rm}| \le 1$ it is seen that propagating
orders exist in the reradiating field for values of m satisfying the
inequalities

$$-\frac{\Lambda}{\lambda}(1 + \cos\psi_i) \le m \le \frac{\Lambda}{\lambda}(1 - \cos\psi_i)$$  (1.2-4)

For $\Lambda/\lambda \ll 1$ only the 0-th order propagates. This order
is termed the specular order or mode. For high frequencies (small values
of $\lambda$) many propagating orders are possible. Orders for values of
m not satisfying Equation (1.2-4) are said to be in cutoff and they
correspond to surface waves that do not carry energy away from the
boundary. Formal extension of path length difference analysis for
interference to two dimensionally periodic surfaces presents no
difficulty.²⁰
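
The grating equation and the accompanying inequalities can be checked numerically. The sketch below lists the propagating orders and their reradiation angles for a given ratio of spatial period to wavelength. The function name and the sample values are assumptions made for the example, and angles are measured from the x-axis as in the text.

```python
import math


def propagating_orders(spatial_period, wavelength, incidence_angle_deg):
    """Return {m: reradiation angle in degrees} for every order m whose
    cosine, cos(incidence) + m * wavelength / spatial_period, lies in
    the interval [-1, 1]."""
    ratio = spatial_period / wavelength
    cos_i = math.cos(math.radians(incidence_angle_deg))
    m_min = math.ceil(-ratio * (1.0 + cos_i))
    m_max = math.floor(ratio * (1.0 - cos_i))
    orders = {}
    for m in range(m_min, m_max + 1):
        cos_r = cos_i + m / ratio
        cos_r = max(-1.0, min(1.0, cos_r))  # guard against rounding error
        orders[m] = math.degrees(math.acos(cos_r))
    return orders


# Hypothetical cases: a corrugation 2.5 wavelengths long, and one only a
# tenth of a wavelength long, both with incidence at 60 degrees.
coarse = propagating_orders(2.5, 1.0, 60.0)
fine = propagating_orders(0.1, 1.0, 60.0)
```

For the coarse corrugation several orders propagate, with the order m = 0 leaving at the incidence angle. For the fine corrugation only the specular order remains.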

An analysis of the scattering of sound from two dimensionally random
surfaces is presented by Eckart.²¹ This treatment clearly demonstrates
the connection between the interference maxima and the surface spectrum.
His analysis begins with the well known Kirchhoff integral theorem for
monochromatic reradiation:


(a)  (1.2-5)


where

$$p_s(x,y,z,t) = p_s(x,y,z)\, e^{-j\omega t}$$  (b)

Here $p_s$ is the reradiation amplitude, k is the wave number
$2\pi/\lambda$, S is the scattering surface, r is the distance from the
receiver point P to a point on the surface, and $\partial/\partial n$ is
the derivative along the outward normal of the surface S.

Eckart assumes that the surface S is illuminated by an incident
wave generated by a directional monochromatic point source with a finite
beam-width:

$$p_i(x,y,z) = B_e(\theta,\phi)\, \frac{e^{jkr'}}{r'}$$  (1.2-6)

Here $B_e(\theta,\phi)$ is the beam function, r' is the distance from
the source point Q, and $\theta$ and $\phi$ are the polar and azimuthal
angles for a spherical coordinate system centered at Q.

The boundary condition on the surface S is assumed to be a
pressure release constraint:

$$p_i + p_s = 0 \quad [\text{on } S]$$  (1.2-7)

After making a series of approximations which are explored in greater
depth in Section 1.4, Eckart arrives at an approximate solution to
Equation (1.2-5) subject to Equations (1.2-6) and (1.2-7) given by


$$p_s(P) \cong \frac{1}{4\pi} \iint_S \frac{B_e(\theta,\phi)}{r r'}\, \frac{\partial}{\partial z} \left[ e^{jk(r+r')} \right] dx\, dy$$  (1.2-8)

Eckart assumed the beam pattern to be narrow enough to permit
linearization of the path length r + r' as follows:

$$r + r' = r_0 + r_0' + (a_s^0 x + b_s^0 y + c_s^0 \zeta)$$  (1.2-9)

where $r_0$ and $r_0'$ are the values of r and r' near the center of
the illuminated area which is considered the origin of the x,y,z
coordinate system and $a_s^0$, $b_s^0$, $c_s^0$ are respectively the
sums of the x,y,z direction cosines of $r_0$ and $r_0'$. On executing
the partial derivative with respect to z in Equation (1.2-8) and
replacing the reciprocal of rr' by its value near the center of the
illuminated area we have

$$p_s(P) = \frac{jk c_s^0\, e^{jk(r_0 + r_0')}}{4\pi r_0 r_0'} \iint B_e(\theta,\phi)\, e^{jk(a_s^0 x + b_s^0 y + c_s^0 \zeta)}\, dx\, dy$$  (1.2-10)

where $\zeta$, $\theta$, and $\phi$ are all functions of x and y. The
result shows that the scattered signal is approximately composed of a
superposition of replicas of the incident wave which are randomly
delayed, attenuated and time differentiated (as may be seen from the
factor jk in Equation 1.2-10). The time derivative should not be
surprising since the result is in the form of a density over arrival
times.²²

Finally, by forming the mean square magnitude for $\hat{p}_s(P)$ Eckart obtained the desired relationship between the reradiation and the statistical properties of the surface. Assuming the surface perturbation$^{23}$ $\zeta(x,y)$ to be a stationary random variable in $x$ and $y$ we have from Equation (1.2-10)

$$\overline{\hat{p}_s(P)\,\hat{p}_s(P)^*} = \left[\frac{kc_s^\circ}{4\pi r_0 r_0'}\right]^2 \iint_{R_{xy}} J(\xi,\eta)\,Q_2(-kc_s^\circ, kc_s^\circ, \xi, \eta)\,e^{-jk(a_s^\circ \xi + b_s^\circ \eta)}\,d\xi\,d\eta \tag{1.2-11}$$


Here the asterisk denotes the complex conjugate and the over-bar indicates ensemble averaging. The function $Q_2(u,v,\xi,\eta)$ is the second order characteristic function corresponding to the two sample probability distribution $F[\zeta(x,y),\zeta(x+\xi,y+\eta)]$ (Equation (1.2-12)), and $J(\xi,\eta)$ is the beam pattern convolution function

$$J(\xi,\eta) = \iint_{R_{xy}} B_e[\theta(x,y),\phi(x,y)]\,B_e[\theta(x+\xi,y+\eta),\phi(x+\xi,y+\eta)]\,dx\,dy \tag{1.2-13}$$

The region $R_{xy}$ in Equations (1.2-12) and (1.2-13) is the infinite x-y plane.

Eckart assumes $\zeta$ to be a Gaussian random process yielding

$$Q_2(u,v,\xi,\eta) = \exp\left\{-\tfrac{1}{2}\left[u^2h^2 + 2uv\,\Psi(\xi,\eta) + v^2h^2\right]\right\} \tag{1.2-14}$$

where $h^2$ is the mean square value of the (zero mean) deformation $\zeta$ and

$$\Psi(\xi,\eta) = \overline{\zeta(x,y)\,\zeta(x+\xi,y+\eta)} \tag{1.2-15}$$

is the correlation function for the deformation at two points on the surface separated by displacements $\xi$ and $\eta$. The dependence of $\Psi$ only on these displacements is a result of the assumption of spatial stationarity of $\zeta$.

By substituting Equation (1.2-14) into (1.2-11) Eckart obtains

$$\overline{\hat{p}_s(P)\,\hat{p}_s(P)^*} = \left[\frac{kc_s^\circ}{4\pi r_0 r_0'}\right]^2 \exp(-k^2h^2c_s^{\circ 2}) \iint_{R_{xy}} J(\xi,\eta)\,\exp\{k^2c_s^{\circ 2}\,\Psi(\xi,\eta)\}\,e^{-jk(a_s^\circ \xi + b_s^\circ \eta)}\,d\xi\,d\eta \tag{1.2-16}$$

Ignoring the beam pattern convolution function $J(\xi,\eta)$ for the moment, it is seen that the directional and frequency selective properties are determined on the average by the space transform of $\exp[k^2c_s^{\circ 2}\,\Psi(\xi,\eta)]$.
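The substitution that leads from Equation (1.2-11) to Equation (1.2-16) can be written out explicitly. The sketch below only evaluates the Gaussian characteristic function of Equation (1.2-14) at $u = -kc_s^\circ$, $v = kc_s^\circ$; the remark about large separations additionally assumes that the correlation $\Psi$ decays to zero there, which the text does not state.

```latex
\begin{aligned}
Q_2(-kc_s^\circ,\,kc_s^\circ,\,\xi,\eta)
  &= \exp\left\{-\tfrac{1}{2}\left[k^2c_s^{\circ 2}h^2
       - 2k^2c_s^{\circ 2}\,\Psi(\xi,\eta) + k^2c_s^{\circ 2}h^2\right]\right\} \\
  &= \exp(-k^2h^2c_s^{\circ 2})\,\exp\{k^2c_s^{\circ 2}\,\Psi(\xi,\eta)\} .
\end{aligned}
% Limits: at zero separation \Psi(0,0) = h^2 and Q_2 = 1;
% if \Psi \to 0 at large separation (assumed), Q_2 \to \exp(-k^2 h^2 c_s^{\circ 2}).
```

The first factor is independent of $\xi$ and $\eta$ and may be taken outside the integral, which gives the form of Equation (1.2-16).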

In a similar manner Eckart obtains the mean of $\hat{p}_s(P)$

$$\overline{\hat{p}_s(P)} = Q_1(kc_s^\circ)\left\{\frac{(jkc_s^\circ)\,e^{jk(r_0+r_0')}}{4\pi r_0 r_0'}\iint_{R_{xy}} B_e(\theta,\phi)\,e^{jk(a_s^\circ x + b_s^\circ y)}\,dx\,dy\right\} \tag{1.2-17}$$

where the quantity in braces is identical to the mirror reflection term $\hat{p}_s(P)$ for $\zeta = 0$ (as may be seen from Equation (1.2-10)) and $Q_1(u)$ is the one dimensional characteristic function for $\zeta$:

$$Q_1(u) = \overline{\exp(ju\zeta)} \tag{1.2-18}$$

Again, by assuming Gaussian statistics for the variable $\zeta$ Eckart finds

$$\overline{\hat{p}_s(P)} = [\hat{p}_s(P)]_{\zeta=0}\;e^{-k^2h^2c_s^{\circ 2}/2} \tag{1.2-19}$$

The mean value $\overline{\hat{p}_s(P)}$ is termed the coherent reradiation by Eckart. While Equation (1.2-10) relies heavily on certain approximations, there is good experimental evidence that Equation (1.2-19) is very accurate$^{24}$ when the restrictions on the approximations are not violated.

In particular, assuming the incident beam to have an angle of grazing $\psi_1$ and examining the forward coherent reradiation into the specular direction defined by

$$\begin{aligned}
a_s^\circ &= 0 && \text{(a)} \\
b_s^\circ &= 0 && \text{(b)} \\
c_s^\circ &= -2\sin\psi_1 && \text{(c)}
\end{aligned} \tag{1.2-20}$$

we have from Equation (1.2-19) the coherent reflection coefficient $R$:

$$R = \frac{\overline{\hat{p}_s(P)}}{[\hat{p}_s(P)]_{\zeta=0}} = \exp(-2k^2h^2\sin^2\psi_1) \tag{1.2-21}$$

It can be seen that this is a function only of the Rayleigh parameter given in Equation (1.0-2).
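As a numerical illustration of Equations (1.2-19) through (1.2-21), the following minimal Python sketch evaluates the coherent reflection coefficient. The function names and the sample values of wavelength, rms height and grazing angle are hypothetical; $h$ is taken as the rms surface deformation, consistent with $h^2$ being its mean square value.

```python
import math


def coherent_factor(wavelength, rms_height, c_s):
    """Attenuation factor exp(-k^2 h^2 c_s^2 / 2) of Equation (1.2-19)."""
    k = 2.0 * math.pi / wavelength  # wave number k = 2*pi/lambda
    return math.exp(-0.5 * (k * rms_height * c_s) ** 2)


def coherent_reflection_coefficient(wavelength, rms_height, grazing_angle_rad):
    """R = exp(-2 k^2 h^2 sin^2(psi_1)) of Equation (1.2-21)."""
    k = 2.0 * math.pi / wavelength
    return math.exp(-2.0 * (k * rms_height * math.sin(grazing_angle_rad)) ** 2)


# Hypothetical values: 1.5 m wavelength, 0.1 m rms height, 30 degree grazing angle.
psi_1 = math.radians(30.0)
R = coherent_reflection_coefficient(1.5, 0.1, psi_1)
# The same number follows from (1.2-19) with c_s = -2 sin(psi_1), Equation (1.2-20c).
R_check = coherent_factor(1.5, 0.1, -2.0 * math.sin(psi_1))
```

Both routes give the same value, and the coefficient depends on wavelength, height and angle only through the product $kh\sin\psi_1$.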

Unfortunately, if we choose to violate restrictions used in obtaining these results this agreement breaks down. For example, if we wish to examine a passive detection problem, a narrow beam pattern for the source at the target is unlikely. If we set $B_e(\theta,\phi) = 1$ and consider reradiation from a flat surface ($\zeta = 0$) then in the specular direction defined by Equations (1.2-20), Equation (1.2-10) yields

$$\hat{p}_s(P) = \infty \tag{1.2-22}$$

The difficulty can be traced to the linearization in Equation (1.2-9). This problem is examined further in Section 1.4.

1.3 Scattering from Time Varying Surfaces


The Extended Kirchhoff Integral Equation

The Kirchhoff formula, Equation (1.2-5), which forms the basis of the Eckart analysis applies strictly only to the case of scattering from time invariant surfaces. Eckart tacitly assumes that if the surface motion is slow compared to the period $\lambda/c$ of the radiation then Equation (1.2-5) still applies but with the boundary

$$[S(t):\ z = \zeta(x,y,t)] \tag{1.3-1}$$

which is explicitly time varying. Eckart's analysis of the mean and mean square of $\hat{p}_s(P)$ did not require an explicit description of the time varying behavior. However, we are primarily interested in examining space-time correlation functions and the temporal fluctuations in their estimates. Therefore, a few qualifying remarks are warranted with regard to this approximation.

The Kirchhoff formula is essentially a restatement of the well known wave equation for small pressure disturbance in isotropic elastic media

$$\nabla^2 p = \frac{1}{c^2}\frac{\partial^2 p}{\partial t^2} \tag{1.3-2}$$

in terms of an integral equation in which the integral is over the surface $S(t)$. The extension of the basic Kirchhoff formula, Equation (1.2-5), to the case of time varying boundaries was carried out by W. R. Morgans$^{25}$ in 1929. We now briefly review his solution.

Mathematical restrictions placed on the pressure $p$ inside the volume $V$ enclosed by the surface $S(t)$ (see Figure 1.3-1) which are necessary for the extended Kirchhoff formula to hold are equivalent to the requirement that $V$ be source-free. Consequently, the solution to Equation (1.3-2) is broken into two parts:

$$p(t) = p_i(t) + p_s(t) \tag{1.3-3}$$

where both $p_i(t)$ and $p_s(t)$ are solutions to Equation (1.3-2). The solution $p_i(t)$ is that due to the source of the radiation but in an unbounded medium, while $p_s(t)$ is the solution which must be added to $p_i(t)$ in order to satisfy the boundary condition on $S(t)$. It is assumed that the incident radiating field $p_i(t)$ is known so that there remains only the problem of obtaining the integral equation for $p_s(t)$.

s 


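Figure 1.3-1. Geometry for Time Variable Surface in the Kirchhoff Formula

The Kirchhoff integral equation for time varying surfaces is obtained in a manner similar to that used in the time invariant case.$^{26,27,28}$ One begins by considering the wave equations for the scattered wave $p_s$ and an auxiliary solution $q$ which is used in the derivation:

$$\begin{aligned}
\text{(a)}\quad \nabla^2 p_s &= \frac{1}{c^2}\frac{\partial^2 p_s}{\partial t^2} \\
\text{(b)}\quad \nabla^2 q &= \frac{1}{c^2}\frac{\partial^2 q}{\partial t^2}
\end{aligned} \tag{1.3-4}$$

Multiplying Equation (1.3-4b) by $p_s$, Equation (1.3-4a) by $q$ and then taking the difference we have

$$\begin{aligned}
q\,\nabla^2 p_s - p_s\,\nabla^2 q
 &= [\nabla\cdot(q\,\nabla p_s) - \nabla p_s\cdot\nabla q] - [\nabla\cdot(p_s\,\nabla q) - \nabla q\cdot\nabla p_s] \\
 &= \nabla\cdot[q\,\nabla p_s - p_s\,\nabla q] \\
 &= \frac{1}{c^2}\left[q\,\frac{\partial^2 p_s}{\partial t^2} - p_s\,\frac{\partial^2 q}{\partial t^2}\right]
  = \frac{1}{c^2}\frac{\partial}{\partial t}\left[q\,\frac{\partial p_s}{\partial t} - p_s\,\frac{\partial q}{\partial t}\right]
\end{aligned} \tag{1.3-5}$$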

We now consider the point $P$ and the time $t_0$ at which we ultimately desire the value of $p_s$. Following Eckart's notation we let $r$ denote the distance from $P$. We surround $P$ with a small sphere $S'$ (see Figure 1.3-1) and designate the volume below $S(t)$ and outside $S'$ as $V'$. Integrating Equation (1.3-5) over $V'$ and using Green's Theorem gives Equation (1.3-6).

Next, the derivation of the Kirchhoff formula requires the integration of Equation (1.3-6) with respect to $t$ and it is at this point that the time variation of $S(t)$ must be taken into account. This may be done by using the integrated form of the Stokes total derivative. For any function $f(x,y,z,t)$

$$\frac{d}{dt}\iiint_{V'} f\,dV = \iiint_{V'} \frac{\partial f}{\partial t}\,dV + \iint_{S(t)} f\,v_n\,dS \tag{1.3-7}$$

where $v_n$ is the velocity of an element of the surface, $dS$, taken along the outward normal, $n$.
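Equation (1.3-7) can be checked numerically in a one dimensional analogue, where the "volume" is the interval from 0 to a moving end point $b(t)$ and the surface term reduces to $f(b,t)\,db/dt$. The sketch below is only an illustration under assumed choices: the integrand $f = xt$, the boundary $b(t) = 1 + t$ and all function names are hypothetical and do not come from the report.

```python
def integrate(func, a, b, t, n=2000):
    """Midpoint-rule approximation of the integral of func(x, t) over [a, b]."""
    dx = (b - a) / n
    return sum(func(a + (i + 0.5) * dx, t) for i in range(n)) * dx


def f(x, t):
    """Hypothetical integrand f(x, t) = x * t."""
    return x * t


def df_dt(x, t):
    """Partial derivative of f with respect to t at fixed x."""
    return x


def boundary(t):
    """Hypothetical moving end point b(t) = 1 + t."""
    return 1.0 + t


BOUNDARY_SPEED = 1.0  # db/dt, plays the role of v_n in Equation (1.3-7)


def total_derivative(t, dt=1e-5):
    """Left side: d/dt of the integral over the moving interval (central difference)."""
    later = integrate(f, 0.0, boundary(t + dt), t + dt)
    earlier = integrate(f, 0.0, boundary(t - dt), t - dt)
    return (later - earlier) / (2.0 * dt)


def transport_form(t):
    """Right side: integral of df/dt plus the moving-boundary term f * v_n."""
    b = boundary(t)
    return integrate(df_dt, 0.0, b, t) + f(b, t) * BOUNDARY_SPEED
```

For these choices the integral is $t(1+t)^2/2$, so both sides equal $(1+t)^2/2 + t(1+t)$; the boundary term is what would be lost if the surface motion were ignored.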

Applying Equation (1.3-7) to (1.3-6) and integrating $t$ over $-\infty$ to $+\infty$ we have


ps  ft3  dV 


t«+«° 

t*-00 


The volume integral in Equation (1.3-8) may be made zero by forcing $q$ to be zero as $t \to \pm\infty$. On specializing $q$ to be the function

$$q(r,t) = W(ct - ct_0 + r)/r \tag{1.3-9}$$

where $W(\xi)$ is a very narrow pulse centered around $\xi = 0$, and which is normalized

$$\int_{-\infty}^{\infty} W(\xi)\,d\xi = 1 \tag{1.3-10}$$
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it  becomes  possible  to  evaluate  the  integral  over  S’  .  If  this  sphere 
is  centered  on  P  and  shrunk  in  size  then  the  integral  over  S'  becomes 


~~~  (-1/r^)  j  W(ct-cto+r)  p<s(r*0,t)  c  dt  +  o(r) 


8 


+  ~  p  (P,t  ) 
c  S  O 


(1.3-11) 


Equation  (1.3-8)  then  reduces  to 


+® 


ill 


dx  dy 


xy 


(1.3-12) 


</  ^  +  d‘> 
J  c  c 


where  H(x,y,t)  is  the  secant  of  the  angle  to  the  z-axis  of  n(x,y,t): 


(1.3-13) 


There remains now only the problem of performing the integral over $t$ in Equation (1.3-12). The details of this manipulation are presented in Appendix A. The result is


/ t\  .  \  ((  dx  dy  ,  H  ,3Ps  .  Vn 

;(P,t  "  JJ  l7(l+f/c)  {?T  +  _2 


£« 

at 


R 


9r  rVn  ] 

'  C2  lp e 
(1+r/c)  r 


xy 


(1.3-14) 


+  i-  d_[  (it  +  III)  H  P5„ 

He  dtl'9n  c  '  11 


(1+r/c)  »  *-C(x,y,t) 
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The  integrand  is  evaluated  at  the  retardation  time  t  -r/c  where  r  is 

o 

the  distance  to  the  portion  dS  of  the  surface  lying  above  dx  dy  when 

it  is  positioned  to  give  a  disturbance  arriving  at  P  at  tQ  .  The 

determination  of  r  for  a  given  x,y,tQ  is  itself  a  formidable  problem. 

Ignoring  terms  of  the  order  v  / c,  r/c,  v  /c,  H/c  and  (3r/3n)/c 

n  n 

which  are  all  quite  small  for  the  problems  of  interest  in  this  report  we 
see  that  Equation  (1.3-14)  becomes 


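$$p_s(P,t_0) = \iint_{R_{xy}} \frac{dx\,dy\,H}{4\pi r}\left[\frac{\partial p_s}{\partial n} + \frac{\partial r}{\partial n}\,\frac{p_s}{r} + \frac{1}{c}\,\frac{\partial r}{\partial n}\,\frac{\partial p_s}{\partial t}\right]_{t = t_0 - r/c,\ z = \zeta(x,y,t)} \tag{1.3-15}$$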
o 


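We desire a comparison of this result with the Kirchhoff formula for monochromatic radiation, Equation (1.2-5). Therefore, we assume that the scattered field which varies due to the surface motion is quasi-monochromatic or narrow band:

$$p_s(x,y,z,t) = \hat{p}_s(x,y,z,t)\,e^{-j\omega t} \tag{1.3-16}$$

where $\hat{p}_s(x,y,z,t)$ is the slowly varying (complex) amplitude. Inserting this into Equation (1.3-15) we have

$$\hat{p}_s(P,t_0) = \iint_{R_{xy}} \frac{dx\,dy\,H}{4\pi r}\left[\frac{\partial \hat{p}_s}{\partial n} + \frac{\partial r}{\partial n}\,\frac{\hat{p}_s}{r} + \frac{1}{c}\,\frac{\partial r}{\partial n}\left\{\frac{\partial \hat{p}_s}{\partial t} - j\omega\,\hat{p}_s\right\}\right] e^{jkr}\;\Bigg|_{t = t_0 - r/c,\ z = \zeta(x,y,t)} \tag{1.3-17}$$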

We assume that the slow variations of the amplitude satisfy the inequality
suggested by Eckart:


(1.3-18) 
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Taking  advantage  of  this  we  find 


Recalling that $dS = H\,dx\,dy$ we see that the result is indeed similar to the time invariant Kirchhoff formula as Eckart suggests, but in some cases it is important to remember the need for the retardation time in the integrand. This occurs when differences in retardation times within the active scattering area become comparable to the characteristic period of the surface time variations.

1.4 The Approximate First Order Solution of the Kirchhoff Integral Equation

In this section we examine more closely the transition from the Kirchhoff formula, Equation (1.2-5), to Eckart's approximate solution, Equation (1.2-8). We also examine the works of a few other authors which either contrast with the Eckart approach or improve it.

From the Dirichlet or pressure release condition assumed by Eckart we have

$$\hat{p}_s(x,y,z,t) = -\hat{p}_i(x,y,z) \quad [z = \zeta(x,y,t)] \tag{1.4-1}$$

Inserting this directly into Equation (1.3-19)

$$\hat{p}_s(P,t_0) = \frac{1}{4\pi}\iint_{S_r(t_0)} \left[\hat{p}_i(x,y,z)\,\frac{\partial}{\partial n}\!\left(\frac{e^{jkr}}{r}\right) + \frac{e^{jkr}}{r}\,\frac{\partial}{\partial n}\,\hat{p}_s(x,y,z,t_0 - r/c)\right] dS \tag{1.4-2}$$

where $S_r(t_0)$ is the appropriately retarded surface. Unfortunately, the partial derivative of $\hat{p}_s$ with respect to $n$ in the integrand of Equation (1.4-2) is unknown a priori, and it cannot be assigned in an independent manner from the constraint Equation (1.4-1). Therefore, although $\hat{p}_i$ can usually be found quite easily, the problem of solving the integral equation (1.4-2) for $\hat{p}_s$ remains a difficult problem.

s 

Currently, the only serious attempt to solve integral equations of the form (1.4-2) in a straightforward manner is given by Uretsky.$^{29}$ His analysis considers only the case of a plane wave reflected from a time invariant sinusoidal corrugation. The technique assumes that the normal derivative of $\hat{p}_s$ is periodic on the surface along the direction of the corrugation and a Fourier expansion is applied. This results in an infinite set of equations in an infinite number of coefficients. Uretsky obtains the first few of these coefficients using approximate numerical methods. This technique, however, does not lend itself to the problem of scattering from arbitrary or random boundaries.

Other authors such as Rice,$^{30}$ or Marsh, Schulkin and Kneale$^{31}$ begin by extending the Rayleigh approach to the case of random surfaces. They assume a plane wave expansion for $\hat{p}_s$ with an unknown but position invariant space spectral density. These authors attempt to match the boundary condition (1.4-1) without making explicit reference to the Kirchhoff formula. This technique, however, also leads to an infinite set of equations for various components of the unknown spectral density and the solution requires conditions that imply weak scattering.

Although the plane wave expansion approach appears to be more attractive than the Uretsky technique from the point of view of flexibility it suffers from a serious defect. Lord and Murphy$^{32}$ show that the assumption of a constant spatial density of plane waves is in contradiction with the Kirchhoff formula. This difficulty was first noted by Lippmann$^{33}$ and later explained by Meecham in an article predating the efforts of Marsh, Schulkin and Kneale. Meecham examines the problem of scattering from corrugations. Using an argument which is easily generalized to two dimensionally random surfaces (see Appendix A) Meecham shows that although a plane wave expansion exists, the density of the expansion becomes position dependent in the neighborhood of the surface deformations. This fact prevents the determination of the far-field wave density from the boundary condition Equation (1.4-1).

Alternatively, Mintzer$^{35}$ proposes a modification of Eckart's analysis which is essentially a perturbation-iteration scheme. This results in a very flexible solution which is applicable for time varying random surfaces and which is theoretically more accurate than that obtained from plane wave expansions. In this treatment Equation (1.4-2) is rewritten as follows:

$$\hat{p}_s(P,t_0) = \frac{1}{4\pi}\iint_{S_r(t_0)} \frac{\partial}{\partial n}\!\left(\frac{e^{jkr}}{r}\,\hat{p}_i(x,y,z)\right) dS + \frac{\delta}{4\pi}\iint_{S_r(t_0)} \frac{e^{jkr}}{r}\,\frac{\partial}{\partial n}\left(\hat{p}_s(x,y,z,t_0 - r/c) - \hat{p}_i(x,y,z)\right) dS \tag{1.4-3}$$

where $\delta$ is a parameter of "smallness" which is 1 for the physical problem of interest. However, we consider the general problem of solving Equation (1.4-3) with arbitrary $\delta \le 1$ and use $\delta$ to drive the iterative solution process.

The choice of the second term as the additive perturbation is motivated by the fact that it is identically zero for a flat, perfectly reflecting surface. In this case the boundary acts as a plane of symmetry or "mirror" surface which reflects "images" of the sources generating $\hat{p}_i$. This term is also known to be small when the amount of overshadowing of the surface $S$ on itself is slight.$^{35}$ It should be noted that multiple scattering and other effects (such as a certain amount of diffraction) are also included in this term. Nevertheless, we broadly refer to this term as the "shadowing" term even when the frequency of radiation is too low to cast distinct shadows. The condition of slight overshadowing requires generally large angles of grazing and small surface slopes.

The standard perturbation-iteration approach to the solution of (1.4-3) is to assume that the solution can be grouped into terms of decreasing order of magnitude in $\delta$:$^{36}$

$$\hat{p}_s(P,t_0) = \sum_{m=0}^{\infty} \delta^m\,\hat{p}_s^{(m)}(P,t_0) \tag{1.4-4}$$

Substituting (1.4-4) into (1.4-3) and collecting terms of equal order in $\delta$ we have

$$\begin{aligned}
\text{(a)}\quad \hat{p}_s^{(0)}(P,t_0) &= \frac{1}{4\pi}\iint_{S_r(t_0)} \frac{\partial}{\partial n}\!\left(\frac{e^{jkr}}{r}\,\hat{p}_i(x,y,z)\right) dS \\
\text{(b)}\quad \hat{p}_s^{(1)}(P,t_0) &= \frac{1}{4\pi}\iint_{S_r(t_0)} \frac{e^{jkr}}{r}\,\frac{\partial}{\partial n}\left(\hat{p}_s^{(0)}(x,y,z,t_0 - r/c) - \hat{p}_i(x,y,z)\right) dS \\
\text{(c)}\quad \hat{p}_s^{(m)}(P,t_0) &= \frac{1}{4\pi}\iint_{S_r(t_0)} \frac{e^{jkr}}{r}\,\frac{\partial}{\partial n}\left(\hat{p}_s^{(m-1)}(x,y,z,t_0 - r/c)\right) dS \quad [\text{for } m \ge 2]
\end{aligned} \tag{1.4-5}$$

Thus, in principle the first iterative solution is determined by the incoming or free-space solution $\hat{p}_i$, and each successive iteration is obtained recursively from the next lower solution. The technique is, however, limited by two drawbacks. First, there is no assurance that the expansion in (1.4-4) converges with $\delta = 1$ for a given boundary or frequency. Second, even the integral for $\hat{p}_s^{(0)}$ cannot be performed exactly owing to the complexity of its integrand, especially for $P$ in the near field.

Despite these limitations the perturbation-iteration method enjoys widespread acceptance in the literature. Frequently the iteration is stopped after only the first term is obtained:

$$\hat{p}_s \approx \hat{p}_s^{(0)} \tag{1.4-6}$$


This approximation which is exact for the flat surface tends to yield the correct frequency behavior even for mild scattering.$^{37}$ It is in this approximation that Eckart obtained his results. We shall limit ourselves in this report to the situations for which (1.4-6) is sufficiently accurate.
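The recursion of Equation (1.4-5) and its convergence caveat can be illustrated on a finite dimensional stand-in. In the sketch below the surface integral operator is replaced by a small matrix `K`, so that the problem reads $p = p_0 + \delta K(p - p_i)$; the matrix, the vectors and the function names are hypothetical and serve only to show the bookkeeping, not the acoustic problem itself.

```python
def matvec(K, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(K[i][j] * v[j] for j in range(len(v))) for i in range(len(K))]


def perturbation_terms(K, p0, p_i, order):
    """Terms p^(0), ..., p^(order) following the pattern of Equation (1.4-5)."""
    terms = [list(p0)]                                    # (a): zeroth-order term
    if order >= 1:
        terms.append(matvec(K, [a - b for a, b in zip(p0, p_i)]))  # (b)
    for _ in range(2, order + 1):
        terms.append(matvec(K, terms[-1]))                # (c): m >= 2
    return terms


def partial_sum(terms, delta):
    """Sum of delta^m * p^(m), the truncated expansion of Equation (1.4-4)."""
    n = len(terms[0])
    return [sum(delta ** m * term[i] for m, term in enumerate(terms)) for i in range(n)]


def direct_solution_2x2(K, p0, p_i, delta):
    """Solve (I - delta K) p = p0 - delta K p_i for a 2x2 K by Cramer's rule."""
    rhs = [a - delta * b for a, b in zip(p0, matvec(K, p_i))]
    a11, a12 = 1.0 - delta * K[0][0], -delta * K[0][1]
    a21, a22 = -delta * K[1][0], 1.0 - delta * K[1][1]
    det = a11 * a22 - a12 * a21
    return [(rhs[0] * a22 - a12 * rhs[1]) / det, (a11 * rhs[1] - a21 * rhs[0]) / det]


# Hypothetical "weak perturbation" example with delta = 1.
K = [[0.2, 0.1], [0.05, 0.3]]
p0, p_i = [1.0, 2.0], [0.5, -1.0]
series = partial_sum(perturbation_terms(K, p0, p_i, 40), 1.0)
exact = direct_solution_2x2(K, p0, p_i, 1.0)
```

With this small `K` the series agrees with the direct solution; scaling `K` up until its largest eigenvalue exceeds one in magnitude makes the partial sums grow without bound, which is the finite dimensional counterpart of the first drawback noted above.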

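We now proceed to evaluate (1.4-5a). The normal derivative is conveniently computed in operator form as follows:

$$\frac{\partial}{\partial n} = n\cdot\nabla = \frac{-\dfrac{\partial\zeta}{\partial x}\dfrac{\partial}{\partial x} - \dfrac{\partial\zeta}{\partial y}\dfrac{\partial}{\partial y} + \dfrac{\partial}{\partial z}}{H(x,y,t)} \tag{1.4-7}$$

where $H(x,y,t)$ is the secant of the angle between $n$ and the vertical and $\partial\zeta/\partial x$ and $\partial\zeta/\partial y$ are the x and y slopes of the surface.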
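The operator form of the normal derivative can be illustrated with a short Python sketch. It assumes the standard geometric relation $H = \sqrt{1 + (\partial\zeta/\partial x)^2 + (\partial\zeta/\partial y)^2}$ for the secant of the angle between the normal and the vertical, and a normal with positive z component; both are assumptions made here for illustration, and the function names are hypothetical.

```python
import math


def secant_factor(slope_x, slope_y):
    """H: secant of the angle between the surface normal and the vertical (assumed form)."""
    return math.sqrt(1.0 + slope_x ** 2 + slope_y ** 2)


def unit_normal(slope_x, slope_y):
    """Unit normal (-d(zeta)/dx, -d(zeta)/dy, 1) / H, taken with positive z component."""
    h = secant_factor(slope_x, slope_y)
    return (-slope_x / h, -slope_y / h, 1.0 / h)


def normal_derivative(gradient, slope_x, slope_y):
    """n . grad(f) for a field whose gradient (fx, fy, fz) is known at the surface point."""
    return sum(n_i * g_i for n_i, g_i in zip(unit_normal(slope_x, slope_y), gradient))


# Hypothetical example: surface slope 0.75 in x, field varying only with z.
n = unit_normal(0.75, 0.0)                      # (-0.6, 0.0, 0.8)
dfdn = normal_derivative((0.0, 0.0, 1.0), 0.75, 0.0)
```

Because $dS = H\,dx\,dy$, the factor $1/H$ in the normal derivative cancels when the surface integral is written over the x-y plane.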

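Next we assume a directional point source located at the point $Q$ (see Equation (1.2-6)). Examining the normal derivative in (1.4-5a) we have

$$\frac{\partial}{\partial n}\left[B_e(\theta,\phi)\,\frac{e^{jk(r+r')}}{rr'}\right] = \frac{e^{jk(r+r')}}{rr'}\left\{\frac{\partial B_e(\theta,\phi)}{\partial n} + B_e(\theta,\phi)\left[jk\,\frac{\partial(r+r')}{\partial n} - \frac{1}{r}\frac{\partial r}{\partial n} - \frac{1}{r'}\frac{\partial r'}{\partial n}\right]\right\} \tag{1.4-8}$$

We consider only the case of wide beam patterns so that $\partial B_e(\theta,\phi)/\partial n$ can be ignored, and we limit ourselves to the far field where

$$k \gg \frac{1}{r},\ \frac{1}{r'} \tag{1.4-9}$$

Hence, (1.4-5a) becomes with the use of (1.4-7) and (1.4-8) together with the relation $dS = dx\,dy\,H$

$$\hat{p}_s(P,t_0) = \iint \frac{dx\,dy}{4\pi}\,B_e(\theta,\phi)\,\frac{e^{jk(r+r')}}{rr'}\left[\left(-\frac{\partial\zeta}{\partial x}\frac{\partial}{\partial x} - \frac{\partial\zeta}{\partial y}\frac{\partial}{\partial y} + \frac{\partial}{\partial z}\right) jk(r+r')\right]_{t = t_0 - r/c,\ z = \zeta(x,y,t)} \tag{1.4-10}$$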

Following Gulin, an x,y,z coordinate system is selected in such a way that both the source point $Q$ and the point of observation $P$ lie in the plane defined by $y = 0$, as shown in Figure 1.4-1. The origin of the coordinate system is chosen to lie in the plane $\zeta = 0$ (the plane defined by the mean of the random deformations) and to be located at the specular point for wide beamwidth incident radiation. The beam pattern is here assumed to be aimed directly at the specular point. This point can be determined by finding the intersection of the plane $\zeta = 0$ and the line $PQ'$ where $Q'$ is the flat surface image point for the source.

Figure 1.4-1

A key assumption which must be made in order to evaluate (1.4-10) is that the most important region of the integration is in the vicinity of the specular point even when the beam pattern is not very narrow. Beckmann observes that while most of the integrand varies slowly over the surface, the exponent is very oscillatory. The most important contribution to the integral comes from within the first Fresnel zone of this spatial oscillation. The larger higher order Fresnel zones become more closely spaced and produce successively smaller contributions of alternating sign which tend to cancel. Provided the angle of grazing $\psi_1$ for the incident beam is large, the first zone is rather circular and centered at the specular point. At lower grazing angles the zones become ellipses elongated along the x-axis so that for wide beam patterns there are significant contributions made far from the specular point.

For large grazing angles, overshadowing is not important and the localized nature of the active scattering region eliminates the need for using different retardation times in the integrand of (1.4-10). It also becomes permissible to expand $r$ and $r'$ about the specular point. Retaining terms to quadratic order in $x/r$, $x/r'$, $y/r$, and $y/r'$ in the exponent of (1.4-10) we obtain (see Appendix C) a result which is slightly more accurate than that given by Gulin for scattering into the specular direction

(1.4-11)

Here $a(x,y)$, $b(x,y)$, $c(x,y)$ are the x,y,z direction cosines for $r$ and $a'(x,y)$, $b'(x,y)$, $c'(x,y)$ are the x,y,z direction cosines for $r'$. The subscript "s" on $a$, $b$, or $c$ denotes the sum of the primed and unprimed quantities and the superscript "o" denotes evaluation of these quantities at the specular point. The distances $r_0$ and $r_0'$ are the distances from the specular point to $P$ and $Q$ and

(1.4-12)

Now at the specular point $a_s$, $b_s$, and $c_s$ satisfy Equation (1.2-20). Assuming that these quantities do not vary appreciably in the active region of scattering (see Appendix C) we have Gulin's result


jk  c°  eJk(ro+ro>  r,  ~ . 

Ps(P,to)  -  - 5 - J  I  dx  dy  B  (8,40 

4tt  r  r '  J  )  e 

o  o 


(1.4-13) 


eJk[y2/Re  +  £(x  c°)2/Re  -  c°c(x,y ,tQ-ro/c) ) 


Finally, we examine the special case of perfect reflection and wide beamwidth. Setting $\zeta \to 0$, $B_e(\theta,\phi) \to 1$
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(1.4-14) 


The result is, of course, exact for all frequencies in spite of the fact that at low frequencies (large $\lambda$) the Fresnel zones are large. It is seen that Gulin's solution is superior to Eckart's for passive detection problems because Eckart's solution diverges in this limit.
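The contrast between the two solutions in the flat surface, wide beam limit can be seen from the surface integrals alone. The following worked comparison is a sketch: $\alpha$ and $\beta$ are hypothetical positive curvature coefficients standing in for the quadratic terms retained in the exponent, not the specific coefficients of Equations (1.4-11) through (1.4-13).

```latex
% Linearized exponent (Eckart), specular direction a_s = b_s = 0, flat surface:
\iint_{-\infty}^{\infty} e^{jk(a_s^\circ x + b_s^\circ y)}\,dx\,dy
   \;\Big|_{a_s^\circ = b_s^\circ = 0}
   = \iint_{-\infty}^{\infty} dx\,dy = \infty .

% Quadratic exponent (Gulin type), using the Fresnel integral
% \int_{-\infty}^{\infty} e^{j a x^2}\,dx = \sqrt{\pi/a}\; e^{j\pi/4}, \quad a > 0:
\iint_{-\infty}^{\infty} e^{jk(\alpha x^2 + \beta y^2)}\,dx\,dy
   = \sqrt{\frac{\pi}{k\alpha}}\,e^{j\pi/4}\,\sqrt{\frac{\pi}{k\beta}}\,e^{j\pi/4}
   = \frac{j\pi}{k\sqrt{\alpha\beta}} .
```

The quadratic phase confines the effective contribution to the first few Fresnel zones, so the integral remains finite and scales as $1/k$, which offsets the factor $jk$ in front of the integral.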
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CHAPTER  2  • 


PRELIMINARY  DISCUSSION  OF 
SIGNAL  AND  SYSTEM  PROPERTIES 


2.0 Introduction

In this chapter the properties assumed for the signals and systems used in this report are summarized. Both signals and corrupting background noises are taken to be stationary random variables although for the most part only wide-sense stationarity need be assumed. Since the fluctuation of the output of finite-time correlators is examined in Chapter 3 it is desirable to assume that the background noise is Gaussian. This simplifies certain fourth order moments that arise in the computation of the variance of the fluctuation. For a similar reason it is desirable to assume that the target signals are Gaussian before they are scattered.

The scattering mechanisms are assumed to be in the form of random, linear, time-varying systems. The question of the various degrees of stationarity of these systems required for different problems is examined. Several equivalent representations for the effect of scattering are discussed. Each representation is useful in various applications. Some attention is focused on the problem of cascading two or more random, linear, time-varying systems. This problem becomes important in the processing of signals and in the representation of multiple scattering when the frequency smear of a particular scattering system is comparable to the bandpass properties of systems following it in the cascade arrangement. While the emphasis of the remaining chapters is placed on wide band detection, these latter results are needed for analysis of narrowband detectors.


2.1 Concerning the Nature of the Target Signal

We are restricting our attention to the passive detection of targets which generate real, stationary Gaussian signals of zero mean. Consider such a signal, $x(t)$ for $-\infty < t < \infty$. It is well-known that $x(t)$ can be expressed in terms of a Fourier Transform written in Stieltjes form* as follows

$$x(t) = \frac{1}{2\pi}\int_{\omega} \exp(j\omega t)\,dz(\omega) \tag{2.1-1}$$

where $z(\omega)$ is termed the spectral distribution corresponding to $x(t)$. From the real nature of the signal $x(t)$ it is seen that the increments of the distribution possess conjugate odd symmetry:

$$-dz(-\omega) = dz(\omega)^* \tag{2.1-2}$$

Using (2.1-1) one may write the correlation between the value of $x$ at two times $t$ and $t'$

$$R(t,t') = \overline{x(t)\,x(t')^*} = \iint_{\omega\ \omega'} \exp(j\omega t - j\omega' t')\,\frac{\overline{dz(\omega)\,dz(\omega')^*}}{(2\pi)^2} \tag{2.1-3}$$

where the overbar denotes ensemble averaging. By the hypothesis of statistical stationarity

$$R(t,t') = R(t-t') \tag{2.1-4}$$

* Following a convention used in the literature, the absence of limits on an integral denotes integration throughout the range of the non-zero integrand while the symbolic label beneath the integral sign scans the domain $[-\infty,\infty]$ in a positive sense.

This functional dependence of the signal correlation on only the time displacement $t-t'$ suggests a relation which can be written symbolically as follows:

$$\overline{dz(\omega)\,dz(\omega')^*} = 2\pi\,dZ(\omega)\,\delta(\omega-\omega')\,d\omega' \tag{2.1-5}$$

where $\delta(\omega)$ is the Dirac delta function and $Z(\omega)$ is the power spectrum distribution function corresponding to $R(\tau)$:

$$R(\tau) = \frac{1}{2\pi}\int_{\omega} \exp(j\omega\tau)\,dZ(\omega) \tag{2.1-6}$$

The random process $z(\omega)$ is said to have orthogonal increments when Equation (2.1-5) is satisfied.
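The relation between the correlation function and the power spectrum distribution can be checked numerically when $Z(\omega)$ has a density $S(\omega) = dZ/d\omega$. The sketch below assumes the $1/2\pi$ normalization of the transform pair and uses a hypothetical Lorentzian density $2a/(a^2+\omega^2)$, whose correlation function is $e^{-a|\tau|}$; the function names and the truncation limits are likewise assumptions for illustration.

```python
import math


def correlation_from_spectrum(spectral_density, tau, omega_max=2000.0, n=400_000):
    """R(tau) = (1/2pi) * integral of S(w) exp(j w tau) dw for a real, even S(w).

    For an even density the imaginary part cancels, so only cos(w tau) is summed.
    The integral is truncated at +/- omega_max and evaluated by the midpoint rule.
    """
    d_omega = 2.0 * omega_max / n
    total = 0.0
    for i in range(n):
        omega = -omega_max + (i + 0.5) * d_omega
        total += spectral_density(omega) * math.cos(omega * tau)
    return total * d_omega / (2.0 * math.pi)


def lorentzian(omega, a=1.0):
    """Hypothetical spectral density 2a / (a^2 + w^2)."""
    return 2.0 * a / (a * a + omega * omega)


r_zero = correlation_from_spectrum(lorentzian, 0.0)   # close to 1
r_one = correlation_from_spectrum(lorentzian, 1.0)    # close to exp(-1)
```

The value at $\tau = 0$ is the mean square of the signal, which is the total area under $S(\omega)$ divided by $2\pi$.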

Although this argument is rather loosely presented and its conclusions somewhat tersely stated, these results are derived in a more rigorous manner by Kolmogorov,$^{43}$ Doob,$^{44}$ and Yaglom.$^{45}$ The distribution function $Z(\omega)$ can be shown to consist of three components: a continuous component, a component with discrete jump discontinuities, and a continuous component with a derivative which vanishes almost everywhere. Since the distribution $z(\omega)$ is a random variable it can be very discontinuous for certain ensemble realizations. These discontinuities in $z(\omega)$ may exist in individual realizations even when $Z(\omega)$ is completely continuous or when, in fact, $Z(\omega)$ is differentiable and possesses a spectral density:

$$S_{xx}(\omega) = \frac{dZ(\omega)}{d\omega} \tag{2.1-7}$$

When this spectral density exists Equation (2.1-5) may be rewritten as follows:

$$\overline{dz(\omega)\,dz(\omega')^*} = 2\pi\,S_{xx}(\omega)\,\delta(\omega-\omega')\,d\omega\,d\omega' . \tag{2.1-8}$$

The distribution $z(\omega)$ is determined to within a constant by the inverse of Equation (2.1-1).

$$z(\omega) = \lim_{T\to\infty}\int_{-T}^{T} \frac{\exp(-j\omega t) - 1}{-jt}\,x(t)\,dt . \tag{2.1-9}$$

Since the mean of $x(t)$ is zero, using this definition we have

$$\overline{z(\omega)} = 0 . \tag{2.1-10}$$

Given any finite collection of x-samples, $x_i$ for $i = 1,2,\ldots,N$ we have the following probability density as a consequence of the Gaussian hypothesis

$$f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^N\,|R|}}\,\exp\left(-\tfrac{1}{2}\,\mathbf{x}^T R^{-1}\,\mathbf{x}\right) \tag{2.1-11}$$

where $\mathbf{x}^T$ is the row vector $(x_1, x_2, \ldots, x_N)$ and $R$ is the matrix of elements $\overline{\mathbf{x}\,\mathbf{x}^T}$. The characteristic function or moment generating function corresponding to this density is then

$$Q(\mathbf{u}) = \overline{\exp(-j\,\mathbf{x}^T\mathbf{u})} = \exp\left(-\tfrac{1}{2}\,\mathbf{u}^T R\,\mathbf{u}\right) \tag{2.1-12}$$

It is easily seen that any linear combination of x samples is also a Gaussian random variable. In particular, the spectral distribution is a continuous linear combination of x values as shown by Equation (2.1-9) and it is therefore Gaussian.

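The Gaussian nature of z(ω) can be used to simplify various higher
order moments of the increments dz(ω). Since

$$ \overline{dz(\omega)} = 0 \tag{2.1-13} $$

odd order moments of the form

$$ \overline{dz(\omega)\, dz(\omega')^{*}\, dz(\omega'')} \tag{2.1-14} $$

vanish identically [49]. Furthermore, higher order even moments can be
expressed in terms of linear combinations of products of second order
moments. For the fourth order moment we have

$$ \begin{aligned} \overline{dz(\omega)\, dz(\omega')^{*}\, dz(\omega'')\, dz(\omega''')^{*}} = {} & \overline{dz(\omega)\, dz(\omega')^{*}}\;\; \overline{dz(\omega'')\, dz(\omega''')^{*}} \\ & + \overline{dz(\omega)\, dz(\omega'')}\;\; \overline{dz(\omega''')^{*}\, dz(\omega')^{*}} \\ & + \overline{dz(\omega)\, dz(\omega''')^{*}}\;\; \overline{dz(\omega')^{*}\, dz(\omega'')} \end{aligned} \tag{2.1-15} $$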

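The second order moments of the first and third term in Equation
(2.1-15) may be simplified by direct application of Equation (2.1-3).
To reduce the second term we invoke Equation (2.1-2):

$$ \overline{dz(\omega)\, dz(\omega'')} = \overline{dz(\omega)\, dz(-\omega'')^{*}} = 2\pi\, S_{xx}(\omega)\, \delta(\omega+\omega'')\, d\omega\, d\omega'' \tag{2.1-16} $$

and

$$ \overline{dz(\omega''')^{*}\, dz(\omega')^{*}} = \overline{dz(-\omega''')\, dz(\omega')^{*}} = 2\pi\, S_{xx}(\omega''')\, \delta(\omega'''+\omega')\, d\omega'''\, d\omega' \tag{2.1-17} $$

Therefore, (2.1-15) becomes

$$ \begin{aligned} \overline{dz(\omega)\, dz(\omega')^{*}\, dz(\omega'')\, dz(\omega''')^{*}} = (2\pi)^{2}\, S_{xx}(\omega)\, \big( & \delta(\omega-\omega')\, S_{xx}(\omega'')\, \delta(\omega''-\omega''') \\ & + \delta(\omega+\omega'')\, S_{xx}(\omega''')\, \delta(\omega'''+\omega') \\ & + \delta(\omega-\omega''')\, S_{xx}(\omega')\, \delta(\omega'-\omega'') \big)\, d\omega\, d\omega'\, d\omega''\, d\omega''' \end{aligned} \tag{2.1-18} $$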
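
The factoring of even-order Gaussian moments into products of second-order moments, used in Equations (2.1-15) through (2.1-18), can be checked on a small example. The Python sketch below is illustrative only: it treats zero-mean, real, jointly Gaussian samples rather than the complex increments dz(ω), and the names `pairings` and `gaussian_moment` as well as the covariance values are assumptions introduced here.

```python
def pairings(items):
    """Yield every way of splitting an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def gaussian_moment(indices, cov):
    """E[x_i1 * x_i2 * ... ] for zero-mean, real, jointly Gaussian samples.

    Odd-order moments vanish; even-order moments are the sum, over all
    pairings of the indices, of products of second-order moments.
    """
    if len(indices) % 2:
        return 0.0
    total = 0.0
    for pairing in pairings(list(indices)):
        term = 1.0
        for a, b in pairing:
            term *= cov[a][b]
        total += term
    return total


# Hypothetical 2x2 covariance matrix R with elements E[x_i x_j].
cov = [[2.0, 0.5], [0.5, 1.0]]

fourth = gaussian_moment([0, 0, 0, 0], cov)  # three pairings, each R00 * R00
mixed = gaussian_moment([0, 0, 1, 1], cov)   # R00*R11 + 2*R01*R01
odd = gaussian_moment([0, 0, 1], cov)        # odd order, vanishes
```

For four indices the sum runs over exactly three pairings, which is the three-term structure of Equation (2.1-15).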


Finally, by entirely analogous reasoning we may represent any pair
of jointly stationary random variables by using orthogonal increments:

$$ y(t) = \frac{1}{2\pi}\int_{\omega} e^{j\omega t}\, dz_{y}(\omega) \;; \qquad x(t') = \frac{1}{2\pi}\int_{\omega'} e^{j\omega' t'}\, dz_{x}(\omega') \tag{2.1-19} $$

where

$$ \overline{dz_{x}(\omega)\, dz_{y}(\omega')^{*}} = 2\pi\, dZ_{xy}(\omega)\, \delta(\omega-\omega')\, d\omega' \tag{2.1-20} $$

and where Z_xy(ω) is the cross-spectral distribution corresponding to
the cross-correlation function for x and y:

$$ R_{xy}(\tau) = \overline{x(t)\, y(t-\tau)} = \frac{1}{2\pi}\int_{\omega} \exp(j\omega\tau)\, dZ_{xy}(\omega) \tag{2.1-21} $$

Assuming Z_xy(ω) to possess a spectral density S_xy(ω):

$$ S_{xy}(\omega) = \frac{dZ_{xy}(\omega)}{d\omega} \tag{2.1-22} $$

we have, following Equation (2.1-18)

(2.1-23)


2.2  System Function Description for Linear Time Varying Filters


As  indicated  in  Chapter  1  a  truly  accurate  and  complete  description 
of  the  effect  of  random  surface  scattering  on  reflected  signals  is  very 
difficult  to  obtain.  Perhaps  the  only  obvious  remark  which  one  can 
initially  make  with  some  assurance  is  that  at  the  power  levels  commonly 
encountered  in  passive  detection  the  phenomenon  of  surface  scattering  is 
linear.  This  is  simply  due  to  the  linearity  of  the  basic  wave  equation 
(1.3-2).  Armed  with  this  knowledge  alone  we  may  draw  several  very 
interesting  conclusions  with  regard  to  optimal  and  sub-optimal  receiver 
design. However, any effort to compute parameters in such designs or to
arrive at some sort of understanding of the degree of the optimality
attained ultimately requires the construction of a realistic scattering
model from the underlying physics.

In this section we examine some of the well-known properties of
surface scattering that are due to linearity. In particular, we study
extensions of the techniques used by Ellinthorpe and Nuttall, Lindenlaub,
and Kailath to describe the dynamic input-output relationships
for Linear Time-varying Filters (hereinafter referred to as LTVF's).
This notation serves as a vehicle for stating those properties of scattering
which one can advance without making reference to a specific physical
model. The eventual goal is, however, to find approximate physical
counterparts for the notational conveniences.

We recall that the outputs y(t) of a LTVF can be related to
their corresponding inputs by a linear functional relation of the form

$$ y(t) = \int_{\sigma} h(\sigma,t)\, x(t-\sigma)\, d\sigma \tag{2.2-1} $$

In particular, we have the following function pair which satisfies this
mapping

$$ x(t) = \delta(t-t_{0}), \qquad y(t) = h(t-t_{0},t) \tag{2.2-2} $$

which identifies h(σ,t) as the response of the LTVF at time t due
to a unit strength impulse applied σ units of time previously. The
parameter σ is therefore an "elapsed-time" or memory variable. For a
given value of σ, h(σ,t) fluctuates slowly as a function of t for the
class of scattering systems of interest in this report. Formally we
identify these slow fluctuations with the surface motion. On the other
hand, for fixed t the variation of h(σ,t) with σ is much more rapid
and generally of short duration.
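
A discrete-time analogue may help fix the roles of the two arguments of h. The Python sketch below is an illustration under stated assumptions, not the report's model: the integral of Equation (2.2-1) is replaced by a finite sum, and the two-tap kernel `h` with a slowly drifting gain is hypothetical. It applies an impulse at sample t0 and confirms that the output is h(t - t0, t), as in Equation (2.2-2).

```python
def ltvf_output(h, x, t):
    """Discrete analogue of (2.2-1): y[t] = sum over s of h(s, t) * x[t - s].

    The input x is taken to be zero before index 0.
    """
    total = 0.0
    for s in range(t + 1):
        total += h(s, t) * x[t - s]
    return total


def h(s, t):
    """Hypothetical kernel: two-tap memory whose gain drifts slowly with t."""
    taps = [1.0, 0.5]
    return taps[s] * (1.0 + 0.1 * t) if s < len(taps) else 0.0


N, t0 = 8, 3
impulse = [1.0 if n == t0 else 0.0 for n in range(N)]

# Response to an impulse applied at t0 ...
response = [ltvf_output(h, impulse, t) for t in range(N)]
# ... compared with h(t - t0, t), the kernel read along the elapsed-time axis.
expected = [h(t - t0, t) if t >= t0 else 0.0 for t in range(N)]
```

The first argument indexes elapsed time since the impulse (the "fast" axis), while the second tracks the absolute output time along which the gain drifts (the "slow" axis).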

By substituting v = t - σ into Equation (2.2-1) we obtain a slightly
different but equivalent representation of the input-output relation:

$$ y(t) = \int_{v} h'(v,t)\, x(v)\, dv \tag{2.2-3} $$

In this representation the new weighting function h'(v,t) is the
reflected and translated impulse response

$$ h'(v,t) = h(t-v,t) \tag{2.2-4} $$

For this representation we have the function pair

$$ x(t) = \delta(t-t_{0}), \qquad y(t) = h'(t_{0},t) \tag{2.2-5} $$

Therefore, h'(v,t) is the impulse response at time t due to excitation
at time v. Whereas σ is a relative time variable, the parameter
v is an absolute time. Although the easy identification of the "fast"
and "slow" axes is an attractive feature in the unprimed impulse response,
Lindenlaub indicates that for certain applications the primed notation is
more advantageous.
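
The equivalence of the primed and unprimed representations can be illustrated in discrete time. In the sketch below the three-tap kernel `h` and the input samples are hypothetical; the primed kernel is formed by the change of variable v = t - σ, and both sums are evaluated for the same input.

```python
def h(s, t):
    """Hypothetical unprimed kernel h(s, t); zero outside a three-tap memory."""
    taps = [1.0, 0.5, 0.25]
    return taps[s] * (1.0 + 0.1 * t) if 0 <= s < len(taps) else 0.0


def h_primed(v, t):
    """Primed kernel: response at absolute time t to excitation at absolute time v."""
    return h(t - v, t)


x = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]
N = len(x)

# Unprimed form: sum over the elapsed-time variable s.
y_unprimed = [sum(h(s, t) * x[t - s] for s in range(t + 1)) for t in range(N)]

# Primed form: sum over the absolute excitation time v.
y_primed = [sum(h_primed(v, t) * x[v] for v in range(N)) for t in range(N)]
```

Both lists agree to rounding error, since the two sums contain the same terms in a different order.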

We  now  consider  three  classes  of  Fourier  transforms  on  the  primed 
and  unprimed  weighting  functions;  primary  axis  transforms,  secondary  axis 
transforms,  and  dual  or  combined  transforms: 

A)  Primary  or  First  Axis  Transforms 

In both the primed and unprimed systems of notation the response of
the LTVF to sinusoidal excitation assumes a role of importance. This is
primarily due to the fact that the sinusoidal response is frequently more
readily accessible from physical considerations than the impulse response.
Moreover, the approximations used in the analysis of scattering tend to
be better at some frequencies than at others. Thus, we define primary
axis transforms H(ω,t) and H'(ω,t) which are generated by the substitution
of

$$ x(t) = \exp(+j\omega t) \tag{2.2-6} $$

into Equations (2.2-1) and (2.2-3)

$$ H(\omega,t) = \frac{y(t)}{\exp(+j\omega t)} = \int_{\sigma} h(\sigma,t)\, \exp(-j\omega\sigma)\, d\sigma \tag{2.2-7} $$

$$ H'(\omega,t) = y(t) = \int_{v} h'(v,t)\, \exp(+j\omega v)\, dv \tag{2.2-8} $$


The quantity H(ω,t) is termed the "instantaneous transfer function" by
Zadeh [64] while H'(ω,t) is termed the sinewave response by Lindenlaub.

The positive sign convention used in the exponent in Equation (2.2-6)
is opposite to that used in Equation (1.2-5), Equation (1.2-16) and the
general literature on surface scattering, but it leads to slightly more
symmetric results in this chapter. The difference in conventions is
reconciled at a later point in this report.

The primed and unprimed transforms are related by

$$ H'(\omega,t) = H(\omega,t)\, \exp(+j\omega t) \tag{2.2-9} $$

For fixed ω both H'(ω,t) and H(ω,t) are slowly varying with t.
These primary axis transforms can be used to restate Equation (2.2-1),
the basic linear functional relation for general excitations:

$$ y(t) = \frac{1}{2\pi}\int_{\omega} H'(\omega,t)\, dz_{x}(\omega) = \frac{1}{2\pi}\int_{\omega} H(\omega,t)\, \exp(+j\omega t)\, dz_{x}(\omega) \tag{2.2-10} $$

where z_x(ω) is the distribution corresponding to x(t) via Equation
(2.1-1).
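
The definitions in Equations (2.2-7) through (2.2-9) can be exercised numerically. The following sketch assumes a discrete-time kernel with three hypothetical taps and a slow linear gain drift; it drives the filter with exp(+jωt) and compares the output with H(ω,t) exp(+jωt).

```python
import cmath

TAPS = [1.0, 0.5, 0.25]  # hypothetical three-tap memory


def h(s, t):
    """Unprimed kernel h(s, t); the tap gains drift slowly with t (assumed model)."""
    return TAPS[s] * (1.0 + 0.05 * t)


def H(omega, t):
    """Discrete analogue of (2.2-7): transform of h along the elapsed-time axis."""
    return sum(h(s, t) * cmath.exp(-1j * omega * s) for s in range(len(TAPS)))


def H_primed(omega, t):
    """Sinewave response as given by (2.2-9)."""
    return H(omega, t) * cmath.exp(1j * omega * t)


def sinusoid_response(omega, t):
    """Filter output at time t when the input is exp(+j*omega*n) for all n."""
    return sum(h(s, t) * cmath.exp(1j * omega * (t - s)) for s in range(len(TAPS)))


omega = 0.8
errors = [abs(sinusoid_response(omega, t) - H_primed(omega, t)) for t in range(10)]
```

The entries of `errors` are at rounding level, so the response to a sinusoid is the sinusoid itself multiplied by the instantaneous transfer function evaluated at the output time.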

B)  Secondary Axis Transforms

By transforming the second axis of the primed and unprimed impulse
response we obtain another useful pair of system functions. These are
written in terms of spectral distributions*

$$ h(\sigma,t) = \frac{1}{2\pi}\int_{\gamma} \exp(j\gamma t)\, dG(\sigma,\gamma) \tag{2.2-11} $$

$$ h'(v,t) = \frac{1}{2\pi}\int_{\gamma} \exp(j\gamma t)\, dG'(v,\gamma) \tag{2.2-12} $$

where if h, h', G, and G' are all well-behaved functions on their
primary axes we have

$$ G(\sigma,\gamma) = \lim_{T\to\infty}\int_{-T}^{+T} \frac{\exp(-j\gamma t) - 1}{-jt}\, h(\sigma,t)\, dt \tag{2.2-13} $$

$$ G'(v,\gamma) = \lim_{T\to\infty}\int_{-T}^{+T} \frac{\exp(-j\gamma t) - 1}{-jt}\, h'(v,t)\, dt \tag{2.2-14} $$

* We here adopt the convention that dG(σ,γ) represents an increment
in the distribution G over γ with σ normally an independent
parameter. When ambiguity might arise, the notation d_γ G(σ,γ) denotes
dγ (∂G/∂γ) assuming the derivative exists.

Substituting Equation (2.2-11) into Equation (2.2-1) we obtain

$$ y(t) = \frac{1}{2\pi}\int_{\sigma}\int_{\gamma} x(t-\sigma)\, \exp(j\gamma t)\, dG(\sigma,\gamma)\, d\sigma \tag{2.2-15} $$

or, applying Equations (2.1-19a,b) and using uniqueness for the increment
representation

$$ dz_{y}(\omega) = \frac{1}{2\pi}\int_{\sigma}\int_{\gamma} e^{-j(\omega-\gamma)\sigma}\, d_{\gamma}G(\sigma,\gamma)\, d_{\omega}z_{x}(\omega-\gamma)\, d\sigma \tag{2.2-16} $$

Following Ellinthorpe and Nuttall we observe that the quantity

$$ d_{\omega}z_{x}(\omega-\gamma)\, \exp(-j(\omega-\gamma)\sigma) \tag{2.2-17} $$

is the spectral increment for a σ time delayed and a γ frequency
shifted replica of the input signal x(t). Thus, since Equation
(2.2-16) represents the spectral increments of y(t) as a superposition
of these terms, G(σ,γ) measures the extent to which the input is spread
along the delay and frequency axes. For this reason G(σ,γ) is termed
the system distributional "spreading" function.
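
The interpretation of G as a delay and frequency spreading measure can be seen on a single-path example. The Python sketch below is a discrete stand-in for the continuous definitions, with assumed sizes `N` and `S` and a hypothetical path of delay `s0`, frequency bin `k0` and gain `a`; the slow-axis transform is taken as a normalized DFT, which is a choice made here for illustration.

```python
import cmath

N = 16                   # number of slow-axis (time) samples, assumed
S = 4                    # number of delay taps, assumed
s0, k0, a = 2, 3, 0.7    # hypothetical path delay, frequency-shift bin, gain


def h(s, t):
    """Single-path kernel: delay of s0 samples and a shift of k0 frequency bins."""
    if s != s0:
        return 0j
    return a * cmath.exp(2j * cmath.pi * k0 * t / N)


def spreading(s, k):
    """Discrete analogue of G: normalized DFT of h(s, t) along the slow axis t."""
    acc = 0j
    for t in range(N):
        acc += h(s, t) * cmath.exp(-2j * cmath.pi * k * t / N)
    return acc / N


G = [[spreading(s, k) for k in range(N)] for s in range(S)]

# Locate the largest entry of |G| over the delay and frequency-shift grid.
peak = max((abs(G[s][k]), s, k) for s in range(S) for k in range(N))
```

All of the weight sits in the single cell (s0, k0); a scattering surface would instead spread it over a neighbourhood of delays and frequency shifts.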

By comparison, on applying similar methods to Equations (2.2-3) and
(2.2-14) one obtains

$$ dz_{y}(\gamma) = \int_{v} d_{\gamma}G'(v,\gamma)\, x(v)\, dv \tag{2.2-18} $$

This relation is considerably simpler than Equation (2.2-16), but the
interpretation of G'(v,γ) is not as intuitively appealing as it is for
G(σ,γ). Following Lindenlaub we refer to G'(v,γ) simply as the impulse
spectral distribution. The relation between the impulse spectrum and the
system spreading function can be found by substituting Equation (2.2-4)
into Equation (2.2-14) and applying Equation (2.2-11):

$$ G'(v,\gamma) = \lim_{T\to\infty}\int_{-T}^{+T} \frac{\exp(-j\gamma t) - 1}{-jt} \left( \frac{1}{2\pi}\int_{\gamma'} \exp(j\gamma' t)\, dG(t-v,\gamma') \right) dt \tag{2.2-19} $$

The complexity of this relation accounts for the simplicity of Equation
(2.2-18) and the usefulness of the primed kernel depends on its availability
from other considerations.

C)  Dual Axis Transforms

Finally, we explore the fourth transform pair in this symmetric
class of system functions. These are the primed and unprimed bi-frequency
relations:

$$ dg(\omega,\gamma) = \int_{\sigma} \exp(-j\omega\sigma)\, dG(\sigma,\gamma)\, d\sigma \tag{2.2-20} $$

$$ dg'(\omega,\gamma) = \int_{v} \exp(+j\omega v)\, dG'(v,\gamma)\, dv \tag{2.2-21} $$

The relation of these bi-frequency spectral distributions to the fast
axis system functions may be found by transforming Equation (2.2-11) and
Equation (2.2-12) on σ and v respectively:

$$ H(\omega,t) = \frac{1}{2\pi}\int_{\gamma} \exp(j\gamma t)\, dg(\omega,\gamma) \tag{2.2-22} $$

$$ H'(\omega,t) = \frac{1}{2\pi}\int_{\gamma} \exp(j\gamma t)\, dg'(\omega,\gamma) \tag{2.2-23} $$

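Furthermore, by applying the inverses of Equation (2.2-7) and Equation
(2.2-8) these become

$$ h(\sigma,t) = \left(\frac{1}{2\pi}\right)^{2}\int_{\omega}\int_{\gamma} e^{j(\gamma t+\omega\sigma)}\, dg(\omega,\gamma)\, d\omega \tag{2.2-24} $$

$$ h'(v,t) = \left(\frac{1}{2\pi}\right)^{2}\int_{\omega}\int_{\gamma} e^{j(\gamma t-\omega v)}\, dg'(\omega,\gamma)\, d\omega \tag{2.2-25} $$

On substitution of Equation (2.2-9) into Equation (2.2-23) and comparing
with Equation (2.2-22) we find by uniqueness that

$$ dg'(\omega,\gamma) = d_{\gamma}g(\omega,\gamma-\omega) \tag{2.2-26} $$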


The quantity g(ω,γ) is termed the bi-frequency spectral distribution
function (following the notation of Ellinthorpe) while g'(ω,γ) is
termed the sinewave spectral distribution function (following Lindenlaub).
By executing the integration over σ in Equation (2.2-16) and invoking
Equation (2.2-20) we obtain the last set of input-output relations for the
fundamental system functions:

$$ \begin{aligned} dz_{y}(\omega) &= \frac{1}{2\pi}\int_{\gamma} d_{\gamma}g(\omega-\gamma,\gamma)\, d_{\omega}z_{x}(\omega-\gamma) && \text{(a)} \\ &= \frac{1}{2\pi}\int_{\omega'} d_{\omega}g(\omega',\omega-\omega')\, d_{\omega'}z_{x}(\omega') && \text{(b)} \\ &= \frac{1}{2\pi}\int_{\omega'} d_{\omega}g'(\omega',\omega)\, dz_{x}(\omega') && \text{(c)} \end{aligned} \tag{2.2-27} $$

By comparison with the distribution G(σ,γ) which measures the spread of
frequency shifts for various arrival delays, g(ω,γ) measures the total
amount of "leakage" of signal at input frequency ω into output frequency
γ cycles displaced from ω. It therefore offers a natural scheme for
analysis or decomposition of system response into side bands.
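
As a worked illustration of the side-band interpretation, consider an assumed single-path kernel (not one of the report's scattering models) that scales the input by a constant a, delays it by σ_0 and shifts it in frequency by γ_0. Applying Equations (2.2-7), (2.2-22) and (2.2-27a) in turn gives, under that assumption:

```latex
\begin{aligned}
h(\sigma,t) &= a\,\delta(\sigma-\sigma_0)\,e^{j\gamma_0 t}, \\
H(\omega,t) &= a\,e^{-j\omega\sigma_0}\,e^{j\gamma_0 t}, \\
dg(\omega,\gamma) &= 2\pi a\,e^{-j\omega\sigma_0}\,\delta(\gamma-\gamma_0)\,d\gamma, \\
dz_y(\omega) &= \frac{1}{2\pi}\int_{\gamma} d_{\gamma}g(\omega-\gamma,\gamma)\,d_{\omega}z_x(\omega-\gamma)
             = a\,e^{-j(\omega-\gamma_0)\sigma_0}\,dz_x(\omega-\gamma_0).
\end{aligned}
```

The output increment at ω is drawn entirely from the input increment at ω - γ_0, so this kernel produces a single side band displaced by γ_0; the same result follows directly from y(t) = a x(t - σ_0) exp(jγ_0 t).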
2.3  Cascaded Linear Time Varying Systems

Lindenlaub observes that the input-output working relationships for
LTVF's are generally somewhat simpler when expressed in terms of the
primed system functions. This feature makes the primed notation
especially suitable for analysis of cascaded chains of LTVF's (see
Figure 2.3-1). To illustrate this let us consider n LTVF's cascaded
in this manner. By repeated application of Equation (2.2-3) one can
immediately show that the overall impulse response h'_{1n}(v,t) for the
cascade is given by

$$ h'_{1n}(v,t) = \int_{v_{1}}\int_{v_{2}}\cdots\int_{v_{n-1}} h'_{1}(v,v_{1})\, dv_{1}\, h'_{2}(v_{1},v_{2})\, dv_{2} \cdots dv_{n-1}\, h'_{n}(v_{n-1},t) \tag{2.3-1} $$


Although  this  expression  is  very  symmetric  it  is  clear  that  generally 
the  n  systems  in  the  chain  cannot  be  permuted  or  interchanged  without 
changing  the  overall  behavior  of  the  chain. 
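
In discrete time the primed kernel becomes a matrix K[v][t] and the cascade rule of Equation (2.3-1) becomes a matrix product, which makes the dependence on ordering easy to see. The sketch below uses two hypothetical systems, a time-varying gain and a one-sample delay; the names and values are assumptions chosen for illustration.

```python
def cascade(ka, kb):
    """Discrete form of (2.3-1) for two systems:
    h'_12(v, t) = sum over v1 of h'_1(v, v1) * h'_2(v1, t)."""
    n = len(ka)
    return [[sum(ka[v][u] * kb[u][t] for u in range(n)) for t in range(n)]
            for v in range(n)]


n = 4
gain = [1.0, 2.0, 3.0, 4.0]  # hypothetical time-varying gain g(t)

# K[v][t] holds h'(v, t): the response at time t to an impulse applied at time v.
k_gain = [[gain[t] if v == t else 0.0 for t in range(n)] for v in range(n)]
k_delay = [[1.0 if t == v + 1 else 0.0 for t in range(n)] for v in range(n)]

gain_then_delay = cascade(k_gain, k_delay)   # output g(t-1) * x(t-1)
delay_then_gain = cascade(k_delay, k_gain)   # output g(t) * x(t-1)
```

An impulse applied at time 0 emerges at time 1 with weight g(0) in the first ordering and g(1) in the second, so the two cascades differ whenever the gain actually varies.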

[Figure 2.3-1: Cascade of n Linear Time Varying Filters]


Since h'_i(v_{i-1},v_i) is usually not known a priori in many practical
situations a frequency domain counterpart for Equation (2.3-1) is
desirable. By utilizing Equations (2.2-12) and (2.2-25) one obtains the
following intermediate result

$$ \int_{v_{1}} h'_{1}(v,v_{1})\, dv_{1}\, h'_{2}(v_{1},v_{2}) = \int_{\gamma_{1}}\int_{\gamma_{2}} \frac{1}{2\pi}\, dG'_{1}(v,\gamma_{1})\, \frac{1}{2\pi}\, dg'_{2}(\gamma_{1},\gamma_{2})\, e^{j\gamma_{2}v_{2}} \tag{2.3-2} $$

Through continued application of Equation (2.2-25) and a final use of
Equation (2.3-23) we obtain


Clearly, many alternative representations for cascaded systems are
possible. In particular, it is occasionally useful to force the first
system in the chain to be represented by its sinewave response. For a
two element chain we have the especially simple result due to Lindenlaub:

$$ H'_{12}(\omega,t) = \int_{v_{1}} H'_{1}(\omega,v_{1})\, h'_{2}(v_{1},t)\, dv_{1} \tag{2.3-4} $$


Since  the  scattering  system  is  usually  the  first  element  in  a  typical 
chain,  it  is  very  advantageous  to  have  the  leading  element  a  primary 
axis  transform. 

As in Equation (2.3-3) we may extend Equation (2.3-4) by sequential
application of Equation (2.2-25) followed by a final use of Equation
(2.2-8) with the following result for long chains:

$$ H'_{1n}(\omega,t) = \int_{v_{1}}\int_{\gamma_{2}}\cdots\int_{\gamma_{n-1}} H'_{1}(\omega,v_{1})\, dv_{1}\, \frac{1}{2\pi}\, dG'_{2}(v_{1},\gamma_{2}) \cdots \frac{1}{2\pi}\, dg'_{n-1}(\gamma_{n-2},\gamma_{n-1})\, H'_{n}(\gamma_{n-1},t) \tag{2.3-5} $$


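Finally, by consecutive application of Equation (2.2-27) one can
easily generate still another relation:

$$ dg'_{1n}(\omega,\gamma) = \int_{\gamma_{1}}\int_{\gamma_{2}}\cdots\int_{\gamma_{n-1}} dg'_{1}(\omega,\gamma_{1})\, \frac{1}{2\pi}\, dg'_{2}(\gamma_{1},\gamma_{2})\, \frac{1}{2\pi} \cdots \frac{1}{2\pi}\, dg'_{n}(\gamma_{n-1},\gamma) \tag{2.3-6} $$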
2.4  Stationary Stochastic System Correlations


In section 2.2 the properties of general LTVF's are examined
subject to general excitations. No explicit use is made there, however,
of the possible random nature of either the signals or the systems. In
this section we investigate various properties of Random Linear Time-Varying
Filters (RLTVF's) subject to excitation by stationary random
variables. The loss of signal correlation on passage through RLTVF's
is measured from several points of view. The degradation of the
cross-correlation between the input and output is determined. The
deterioration of the auto-correlation of the output with respect to the
input is obtained. In addition, we examine the cross-correlation between
outputs of RLTVF's taken in parallel and in series cascade.

We distinguish in the following between averages carried out over
the ensemble of excitations, the ensemble of random or scattering filters,
and the union of these two sets. Ensemble averaging over excitations is
denoted by an overbar labeled "x" while RLTVF ensemble averaging is
denoted by "s".

Also discussed is the definition used for system stochastic stationarity
and a variety of useful system correlation functions.


A)  The  Input-to-Output  Cross  Decorrelation  Function 


The  simplest  stochastic  input-output  relation  for  a  single  RLTVF 
concerns  the  input-output  cross-correlation: 


R  (t) 
xy 


_  f _ s _ : 

x ( t+T )  y (t)  -  J  h(o,t)  x ( t+T )  x(t-c) 


h  (o)  R  (t-o)  d<a 


(u>)  S  (w)  exp(+jwT)  t— 
c  xx  2n 


f  H  (u»)  S 
I  c  x 

J  U) 

f  Dc(t.o' 
J  „ ' 


)  R  (o'.)  do 

XX 


where

$$ \text{(a)}\;\; h_{c}(\sigma) = \overline{h(\sigma,t)}^{\,s} \;; \qquad \text{(b)}\;\; H_{c}(\omega) = \overline{H(\omega,t)}^{\,s} \tag{2.4-2} $$

are respectively the so-called coherent impulse response and coherent
transfer functions for the RLTVF. These are related by

$$ h_{c}(\sigma) = \frac{1}{2\pi}\int_{\omega} H_{c}(\omega)\, \exp(+j\omega\sigma)\, d\omega \tag{2.4-3} $$

Equations (2.4-2), of course, assume system stationarity at least in the
mean.

Because scattering invariably causes loss of correlation we refer
to

$$ D_{c}(\tau,\sigma') = h_{c}(\tau-\sigma') \tag{2.4-4} $$

as the input-to-output cross decorrelation function or the "straight
through" coherency degradation.
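
The loss of coherence implied by averaging the transfer function over the filter ensemble can be illustrated with a deliberately simple model that is not taken from the report: a pure delay whose value jitters from realization to realization as a zero-mean Gaussian variable. For that assumed model H(ω) = exp(-jωd), and the ensemble mean has the closed form exp(-ω²s²/2), where s is the jitter standard deviation. The Monte Carlo sketch below compares the two; the function name, seed and trial count are arbitrary choices.

```python
import cmath
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def coherent_transfer(omega, jitter_std, trials=20000):
    """Monte Carlo estimate of H_c(omega), the ensemble mean of exp(-j*omega*d),
    for a hypothetical pure-delay filter with Gaussian delay jitter d."""
    acc = 0j
    for _ in range(trials):
        d = random.gauss(0.0, jitter_std)
        acc += cmath.exp(-1j * omega * d)
    return acc / trials


omega, jitter_std = 1.0, 1.0
estimate = coherent_transfer(omega, jitter_std)

# Closed form for Gaussian jitter: the characteristic function of d.
closed_form = math.exp(-0.5 * (omega * jitter_std) ** 2)
```

Every realization has unit gain, yet the coherent transfer function has magnitude well below one, and it falls further as either the frequency or the jitter grows.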

B)  The  Output  Auto  Decorrelation  Function 

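The second stochastic input-output relation for a single RLTVF
concerns the output auto-correlation:

$$ R_{yy}(t,t') = \overline{y(t)\, y(t')^{*}}^{\,x,s} = \left(\frac{1}{2\pi}\right)^{2}\int_{\omega}\int_{\omega'} \overline{H(\omega,t)\, H^{*}(\omega',t')}^{\,s}\, \exp(j\omega t - j\omega' t')\, \overline{dz_{x}(\omega)\, dz_{x}(\omega')^{*}}^{\,x} \tag{2.4-5} $$

Assuming that x(t) is a stationary random variable in the wide-sense
then from Equations (2.1-5) and (2.1-8)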
$$ R_{yy}(t,t') = \frac{1}{2\pi}\int_{\omega} \Phi(\omega,\omega,t,t')\, \exp(j(t-t')\omega)\, dZ_{xx}(\omega) = \int_{\omega} \Phi(\omega,\omega,t,t')\, \exp(j(t-t')\omega)\, S_{xx}(\omega)\, \frac{d\omega}{2\pi} \tag{2.4-6} $$

where the latter equality holds if a spectral density S_xx(ω) exists.
Here we define

$$ \Phi(\omega,\omega',t,t') = \overline{H(\omega,t)\, H^{*}(\omega',t')}^{\,s} \tag{2.4-7} $$

as the interfrequency system correlation function. It can be seen that
a sufficient condition for the output of the RLTVF, y(t), to be
stationary in the wide sense is

$$ \Phi(\omega,\omega,t,t') = \Phi(\omega,\omega,t-t') \tag{2.4-8} $$

We call such a system a wide sense stationary system (WSS). For this
class of systems we have

$$ R_{yy}(\tau) = \frac{1}{2\pi}\int_{\omega} \Phi(\omega,\omega,\tau)\, e^{j\omega\tau}\, dZ_{xx}(\omega) \tag{2.4-9} $$


Note that since the mean of x(t) is assumed to be zero, y(t) is also
zero mean.

Now we define a more restrictive class of RLTVF's which satisfy a
broader type of stationarity:

$$ \Phi(\omega,\omega',t,t') = \Phi(\omega,\omega',t-t') \tag{2.4-10} $$

Such systems are here termed interfrequency-wide-sense stationary (IWSS).
All of the random scattering systems to be studied satisfy this condition.
An important property of this class of systems is that one may define for
it a new spectral distribution:

$$ \Phi(\omega,\omega',\tau) = \frac{1}{2\pi}\int_{\omega''} \exp(j\omega''\tau)\, d_{\omega''}A(\omega,\omega',\omega'') \tag{2.4-11} $$

We refer to the function A as the "tri-frequency" spectral distribution.
By substituting Equation (2.4-11) into Equation (2.4-9) we obtain a
stochastic input-output relation

$$ \begin{aligned} R_{yy}(\tau) &= \left(\frac{1}{2\pi}\right)^{2}\int_{\omega}\int_{\omega''} e^{j(\omega+\omega'')\tau}\, d_{\omega''}A(\omega,\omega,\omega'')\, dZ_{xx}(\omega) \\ &= \frac{1}{2\pi}\int_{\omega'} e^{j\omega'\tau} \left\{ \frac{1}{2\pi}\int_{\omega} d_{\omega'}A(\omega,\omega,\omega'-\omega)\, dZ_{xx}(\omega) \right\} \end{aligned} \tag{2.4-12} $$

By the uniqueness of the spectral increment representation this becomes*

$$ dZ_{yy}(\omega') = \frac{1}{2\pi}\int_{\omega} d_{\omega'}A(\omega,\omega,\omega'-\omega)\, dZ_{xx}(\omega) \tag{2.4-13} $$

From the similarity between Equations (2.4-13) and (2.2-27) it is seen that
the function A(ω,ω,ω'-ω) operates on the input power spectral increments
dZ_xx(ω) in the same way that g(ω,ω'-ω) operates on dz_x(ω). That is,
A measures the average amount of input power at frequency ω which is
parametrically "pumped" into output frequency ω'.

* d_{ω'}A(ω,ω,ω'-ω)/dω' = Γ(ω,ω') is called a bi-frequency function by
Zadeh and others in the literature.

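This relationship between the tri-frequency and bi-frequency
functions can be obtained by a more direct method. Starting with
Equation (2.2-22) we have

$$ \Phi(\omega,\omega',t-t') = \left(\frac{1}{2\pi}\right)^{2}\int_{\gamma}\int_{\gamma'} \exp(j\gamma t - j\gamma' t')\, \overline{dg(\omega,\gamma)\, dg^{*}(\omega',\gamma')}^{\,s} \tag{2.4-14} $$

This in turn suggests the relation

$$ \overline{dg(\omega,\gamma)\, dg^{*}(\omega',\gamma')}^{\,s} = 2\pi\, \delta(\gamma-\gamma')\, dA(\omega,\omega',\gamma) \tag{2.4-15} $$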


That is, the increments in the bi-frequency distribution function are
orthogonal even at different values of the fast axis transform frequencies.

The orthogonality of the bi-frequency function increments obviously
carries over to the increments of the system spreading function
distribution. Inverting Equation (2.2-20) we find


dG(a, y)  dG*(o ’ , y ' ) 
2 


(^)  j  j  exp  ( -  j  wa+ j  w '  o  ’ )  dg(w,  y)  dg*(w'  ,y')  dm  dm' 

U)  u’ 


(2.4-16) 


2tt  6(y~Y ')  dB(o,o,,Y) 


where 


c1B(o,o’,y)  ■  J  J  dA(a),m,,Y)  exp{-j  (mo-u'o')  ) 


(2.4-17) 


m  ai 


This result may be combined with Equation (2.2-16) to yield a power spectrum
spreading function relationship


dz  (<o)  dz  (w')* 

y  y 


11,11, 
a  a  y  y 


2*  dZ 


xx 


■  2ir  6(w-ui')  dZyy(w)  ■ 

!  2 _ 

(u>-Y)  fiCw-w'-Y-hr')  (^)  dG(o, y)  dG*(o 

exp{-j  ((0-y)o+J  (u>'-y')o'  }  do  do' 


(2.4-18) 


Using  Equation  (2.4-16)  and  integrating  Equation  (2.4-18)  with  respect  to 
u) '  and  y  ' 


dZ  (u>) 

yy 


dZ  (w-y)  exp{-j  (u>-y)  (o-o*)  }  dB(o,o',Y)  do  do' 


111 

Y  o  o' 

J J  dZxx(w-Y)  exp{-j(uj-Y)o")  do" 


(2.4-19) 


Y  o’ 


where σ'' = σ' - σ and

$$ dC(\sigma'',\gamma) = \int_{\sigma} dB(\sigma,\sigma+\sigma'',\gamma)\, d\sigma \tag{2.4-20} $$

Here C might be termed the power spectrum "smearing" function.
Equation (2.4-20) shows that the output signal correlation can be
represented as a distribution of replicas of the input correlation which
are shifted in frequency and over arrival times.

This  leads  us  immediately  to  the  most  enlightening  stochastic 
input-output  relation.  From  Equation  (2.4-19)  we  have 


R 

yy 


(T) 


/  /  JdZxx(“-Y) 

o  co  y 


oxp{  j  (ury)  (t~o)  )  (y^) 
e^T  dC(o,y)  dto  do 


*  /  VT-°'> 

o' 


R  (o')  do' 
xx 


(2.4-21) 
where D_a is the auto-decorrelation function:

D^.t.o')  *  J  ex?^YT)  ~ -^^a— 

Y 

m  2rt  J  J  exP^YT)  dB^*', o'^x-d'  ,y)  doM 

Y  o" 


(2 . 4—22) 


1_ 
2  it 


If  II. 

0  y  loio 


cxp{j  [  Yf+w0,,“u*  (o”— X— 0 * )  ]  } 


dA(a),u)'  ,y) 


da>  da)'  do” 
2tt  2 it 


Performing first the integral over σ'', then the integral over ω' and
finally applying the definition of the tri-frequency function given in
Equation (2.4-11) we find

$$ D_{a}(\tau,\sigma') = \int_{\omega} \Phi(\omega,\omega,\tau)\, e^{+j\omega(\tau+\sigma')}\, \frac{d\omega}{2\pi} \tag{2.4-23} $$

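Alternatively, the auto-decorrelation function can be written in terms of
the second moment of the impulse response:

$$ \begin{aligned} D_{a}(\tau,\sigma') &= \int_{\omega} \overline{H(\omega,t)\, H^{*}(\omega,t-\tau)}^{\,s}\, \exp\{+j\omega(\tau+\sigma')\}\, \frac{d\omega}{2\pi} \\ &= \int_{\omega}\int_{\sigma}\int_{\sigma''} \overline{h(\sigma,t)\, h(\sigma'',t-\tau)}^{\,s}\, \exp\{j\omega(\sigma-\sigma''+\tau+\sigma')\}\, \frac{d\omega}{2\pi}\, d\sigma\, d\sigma'' \end{aligned} \tag{2.4-24} $$

Again, by performing first the integral over ω and then over σ'' this
reduces to

$$ D_{a}(\tau,\sigma') = \int_{\sigma} \overline{h(\sigma,t)\, h(\sigma+\tau+\sigma',t-\tau)}^{\,s}\, d\sigma \tag{2.4-25} $$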
Equation (2.4-23) is for our purposes more useful than Equation (2.4-25).
However, the latter relation is important because it demonstrates that
the smearing of the signal correlation function can be noticeable even
when the instantaneous impulse response tends to be narrower than the
input correlation width. This occurs when the slow system variations
tend to translate the instantaneous centroid of the impulse response
along the delay axis.

C)  Parallel RLTVF Cross Decorrelation

By entirely analogous reasoning we may extend all of the single
RLTVF decorrelation equations to the problem of describing the
decorrelation of the outputs of two RLTVF's in parallel and excited
from a common source (see Figure 2.4-1).

[Figure 2.4-1: Parallel RLTVF Decorrelation]

Here one distinguishes between ensemble averages carried out over the
random parameters of H_1(ω,t) and H_2(ω,t) by denoting the former by
"s_1" and the latter with "s_2". Again one defines a system correlation
function:

$$ \Phi_{12}(\omega,\omega',t,t') = \overline{H_{1}(\omega,t)\, H_{2}(\omega',t')^{*}}^{\,s_{1}s_{2}} = \Phi_{21}(\omega',\omega,t',t)^{*} \tag{2.4-26} $$
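One defines interfrequency wide-sense cross stationary (IWSCS) by

$$ \Phi_{12}(\omega,\omega',t,t') = \Phi_{12}(\omega,\omega',t-t') \tag{2.4-27} $$

It is not difficult to see that all of Equations (2.4-5) through (2.4-25)
may be rewritten with the appropriate '12' subscripts. The relations
of primary interest are summarized here:

$$ dZ_{y_{1}y_{2}}(\omega') = \frac{1}{2\pi}\int_{\omega} d_{\omega'}A_{12}(\omega,\omega,\omega'-\omega)\, dZ_{xx}(\omega) \tag{2.4-28} $$

$$ \Phi_{12}(\omega,\omega',\sigma) = \frac{1}{2\pi}\int_{\omega''} \exp(+j\omega''\sigma)\, dA_{12}(\omega,\omega',\omega'') \tag{2.4-29} $$

$$ R_{y_{1}y_{2}}(\tau) = \frac{1}{2\pi}\int_{\omega} \Phi_{12}(\omega,\omega,\tau)\, \exp(j\omega\tau)\, dZ_{xx}(\omega) \tag{2.4-30} $$

$$ R_{y_{1}y_{2}}(\tau) = \int_{\sigma'} D_{12}(\tau,\sigma')\, R_{xx}(\sigma')\, d\sigma' \tag{2.4-31} $$

$$ D_{12}(\tau,\sigma) = \int_{\omega} \Phi_{12}(\omega,\omega,\tau)\, \exp\{+j\omega(\tau+\sigma)\}\, \frac{d\omega}{2\pi} \tag{2.4-32} $$

$$ D_{12}(\tau,\sigma) = \int_{\sigma'} \overline{h_{1}(\sigma',t)\, h_{2}(\sigma'+\tau+\sigma,t-\tau)}^{\,s_{1}s_{2}}\, d\sigma' \tag{2.4-33} $$

Here A_12(ω,ω',ω'') and D_12(τ,σ) are termed the cross tri-frequency
function and the output cross decorrelation function respectively.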

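D)  Series Cascade Output Auto-Decorrelation

Finally, we examine the problem of cascading RLTVF's. For the two
filter cascade shown in Figure 2.4-2 we have from Equation (2.3-3)

$$ H_{ab}(\omega,t) = e^{-j\omega t}\, \frac{1}{2\pi}\int_{\gamma} H'_{b}(\gamma,t)\, dg'_{a}(\omega,\gamma) = e^{-j\omega t}\, \frac{1}{2\pi}\int_{\gamma} H_{b}(\gamma,t)\, e^{+j\gamma t}\, d_{\gamma}g_{a}(\omega,\gamma-\omega) \tag{2.4-34} $$

where Equations (2.2-9) and (2.2-26) are used. From this the system
inter-frequency correlation function for the cascade may be computed.
Assuming independence between the random parameters of systems a and
b we have

$$ \begin{aligned} \Phi_{ab}(\omega,\omega',\sigma) &= \overline{H_{ab}(\omega,t)\, H_{ab}(\omega',t-\sigma)^{*}} \\ &= \left(\frac{1}{2\pi}\right)^{2}\int_{\gamma}\int_{\gamma'} \overline{H_{b}(\gamma,t)\, H_{b}(\gamma',t-\sigma)^{*}}\, \exp\{j(\gamma-\gamma')t + j\gamma'\sigma\} \\ &\qquad \times\, \overline{dg_{a}(\omega,\gamma-\omega)\, dg_{a}(\omega',\gamma'-\omega')^{*}}\, \exp\{-j(\omega-\omega')t - j\omega'\sigma\} \end{aligned} \tag{2.4-35} $$

Invoking Equations (2.4-7), (2.4-15) and performing the integration over
γ' this becomes

$$ \Phi_{ab}(\omega,\omega',\sigma) = \frac{\exp(-j\omega\sigma)}{2\pi}\int_{\gamma} \Phi_{b}(\gamma,\gamma-\omega+\omega',\sigma)\, \exp(+j\gamma\sigma)\, d_{\gamma}A_{a}(\omega,\omega',\gamma-\omega) \tag{2.4-36} $$

[Figure 2.4-2: Cascade RLTVF Decorrelation]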
The cascade tri-frequency function and auto-decorrelation function are
then readily generated. For example, by rewriting Equation (2.4-11) as
follows:

$$ \Phi_{b}(\gamma,\gamma-\omega+\omega',\sigma) = \frac{1}{2\pi}\int_{\omega''} e^{j(\omega''+\omega-\gamma)\sigma}\, d_{\omega''}A_{b}(\gamma,\gamma-\omega+\omega',\omega''+\omega-\gamma) \tag{2.4-37} $$

and substituting the result into Equation (2.4-36) we have

(2.4-38)

Finally, the auto-decorrelation function for the cascade is particularly
simple to obtain. Using Equation (2.4-21) we have

(2.4-39)

where D_a(σ'',σ) and D_b(τ,σ'') are the auto-decorrelation functions for
systems a and b respectively.

Another  useful  relation  which  is  effectively  the  dual  of  Equation 
(2.4-36)  may  be  obtained  by  starting  with  Equation  (2.3-4): 


♦^((U.u’.o)  -  e~i(u~u  )t+j“  0  H^b(w,t)  H^b(u',t-o)* 
e-j<u™-)t-Vo JJ  Hji(u,v)  e+j(®v-ca'v')  (2.4-40) 

V  V1 

h.  (t-v, t)  h.  (t-o-v'  ,t-o)  dv  dv' 

u  D 
or


*ab(“>"'.®>  *  e~^w’a  J  ^(u, „>,„»)  a+Ju'v" 


tb(u-u',o,v")  dv"  (2.4-41) 


where 


Vr.o.v)  -  J  elr°'  hb(or7o'hb(o1-o+v,t-o)  do1 


(2.4-42) 


might be termed the extended auto-decorrelation function for system b
because

$$ T_{b}(0,\tau,\sigma) = D_{b}(\tau,\sigma) $$

Rewritten in the frequency domain Equation (2.4-42) becomes

Tb(r.«.v)  •  (jp2  J  J  J  ,J ro'VY,(Y,v 


(2.4-43) 


,v) 


0*  y  »  yH 


(2.4-44) 


exptjY'a’-jY'^o’-j-o+v)}  da'  dY'  dY" 

f  *b(Y,,-r.Y",v)  ejY‘,(v+0>  il" 
in  2  IT 


where the last equality is obtained by first integrating over σ' and
then over γ'. The relative usefulness of the dual Equations
(2.4-41) and (2.4-36) depends on the ease of integration.
2.5  Time Invariant and Slowly Varying Systems


Before closing this chapter we simplify various quantities derived
in sections 2.2 and 2.4 for the case of time-invariance. First we have
the specialized versions of the eight fundamental system functions:

$$ \begin{aligned} &\text{a)}\;\; h(\sigma,t) = h_{f}(\sigma) & &\text{b)}\;\; h'(v,t) = h_{f}(t-v) \\ &\text{c)}\;\; H(\omega,t) = H_{f}(\omega) & &\text{d)}\;\; H'(\omega,t) = H_{f}(\omega)\, e^{+j\omega t} \\ &\text{e)}\;\; dG(\sigma,\gamma) = h_{f}(\sigma)\, \delta(\gamma)\, d\gamma & &\text{f)}\;\; dG'(v,\gamma) = H_{f}(\gamma)\, e^{-j\gamma v}\, d\gamma \\ &\text{g)}\;\; dg(\omega,\gamma) = 2\pi\, \delta(\gamma)\, H_{f}(\omega)\, d\gamma & &\text{h)}\;\; dg'(\omega,\gamma) = 2\pi\, \delta(\gamma-\omega)\, H_{f}(\omega)\, d\gamma \end{aligned} \tag{2.5-1} $$

where the "fixed" or static impulse response is related to its transfer
function by the usual formula

$$ h_{f}(\sigma) = \int_{\omega} H_{f}(\omega)\, \exp(+j\omega\sigma)\, \frac{d\omega}{2\pi} \tag{2.5-2} $$

Thus, Equations (2.5-1) (e) through (h) show that there exists spreading
only along the "delay" axis and no frequency "leakage" when the system is
time invariant. For a very slowly time-varying system the frequency smear
therefore tends to be narrow.
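
The absence of frequency leakage for a fixed system, and its narrowness for a slowly varying one, can be seen in a discrete sketch. The tap values, the 20 percent gain modulation and the record length below are hypothetical; the slow-axis spectrum is taken as a normalized DFT over t, which is an assumption made for illustration.

```python
import cmath
import math

N = 32             # number of slow-axis samples, assumed
taps = [1.0, 0.5]  # hypothetical static impulse response h_f


def h_fixed(s, t):
    """Time-invariant kernel: no dependence on t."""
    return taps[s]


def h_slow(s, t):
    """Slowly varying kernel: gain modulated by one cycle over the record."""
    return taps[s] * (1.0 + 0.2 * math.cos(2.0 * math.pi * t / N))


def slow_axis_spectrum(h, s):
    """Magnitude of the normalized DFT of h(s, t) over t, for one delay tap s."""
    out = []
    for k in range(N):
        acc = 0j
        for t in range(N):
            acc += h(s, t) * cmath.exp(-2j * cmath.pi * k * t / N)
        out.append(abs(acc) / N)
    return out


fixed_bins = slow_axis_spectrum(h_fixed, 0)
slow_bins = slow_axis_spectrum(h_slow, 0)
```

The fixed kernel occupies only the zero-shift bin, while the slowly modulated kernel adds two small side bands in the bins adjacent to zero shift (indices 1 and N-1) and nothing elsewhere.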

For the completely time invariant system the input-output relations
reduce to the familiar forms:

$$ \text{a)}\;\; y(t) = \int_{\sigma} h_{f}(\sigma)\, x(t-\sigma)\, d\sigma \qquad \text{b)}\;\; y(t) = \frac{1}{2\pi}\int_{\omega} H_{f}(\omega)\, e^{j\omega t}\, dz_{x}(\omega) \tag{2.5-3} $$

The stochastic system correlation function and distributions are reduced
for time invariance as follows:
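$$ \begin{aligned} &\text{a)} & \Phi(\omega,\omega',\sigma) &= \Phi_{f}(\omega,\omega') = \overline{H_{f}(\omega)\, H_{f}(\omega')^{*}} \\ &\text{b)} & dA(\omega,\omega',\omega'') &= \Phi_{f}(\omega,\omega')\, \delta(\omega'')\, d\omega'' \\ &\text{c)} & dB(\sigma,\sigma',\gamma) &= \overline{h_{f}(\sigma)\, h_{f}(\sigma')}\, \delta(\gamma)\, d\gamma \\ &\text{d)} & T(\gamma,\tau,\sigma) &= \int_{\sigma'} e^{j\gamma\sigma'}\, \overline{h_{f}(\sigma')\, h_{f}(\sigma'+\tau+\sigma)}\, d\sigma' = T_{f}(\gamma,\sigma+\tau) \\ &\text{e)} & D_{a}(\tau,\sigma) &= T_{f}(0,\sigma+\tau) \\ &\text{f)} & dC(\sigma'',\gamma) &= T_{f}(0,\sigma'')\, \delta(\gamma)\, d\gamma \end{aligned} \tag{2.5-4} $$

For purely non-random filters the overbar is deleted.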

From Equation (2.5-4b) it is seen that in the limit of very slowly
time-varying systems the tri-frequency distribution weights the point
ω'' = 0 very heavily. Thus, in Equation (2.4-36) the distribution
for the first system of a two system cascade weights the point γ = ω
heavily. Therefore, provided the function Φ_b(γ,γ-ω+ω',σ) is suitably
smooth in the vicinity of this point we have the approximate result that

(2.5-5)

In this limit, of course, the order of the systems in the chain may be
reversed. In fact, to the same limit we have

$$ H_{ab}(\omega,t) = H_{a}(\omega,t)\, H_{b}(\omega,t) \tag{2.5-6} $$
CHAPTER 3


STATISTICS  OF  THE 
FINITE  TIME  CORRELATOR  OUTPUT 


3.0  Introduction 

In this chapter practical multipath and array correlator detector-trackers
are considered. The mean and variance of the correlator output
are determined. The correlator input consists of signals which may be
degraded by surface scattering and additive, Gaussian, stationary background
noise. The signals are assumed to be Gaussian and stationary when
they are emitted at the target. The scattering is represented by passage
of the signal through an interfrequency wide-sense stationary (IWSS) RLTVF.

Although the IWSS assumption guarantees that the scattered signals retain
their stationarity, it is not safe to assume that they remain Gaussian
processes. The departure from Gaussian statistics is examined.

It is shown that the correlator output exhibits fluctuations that include
contributions due to the noise, to the randomness of the target signal,
and to the time-variation and randomness of the scattering. These
fluctuations are termed estimation noise since the output of the correlator
operating during the finite interval $[0,T]$ is only an estimate of the
ergodic mean value obtained as $T \to \infty$. Of the three components of the
output variance, the fluctuation due to the slow time-variation of the
scattering is by far the most persistent. This component occasionally
forces the use of extremely long processing intervals. Expressions for
the error probabilities of a two sided test on correlator output are
derived.
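The notion of estimation noise can be illustrated numerically. The following is a minimal sketch, not the report's model: it assumes discrete-time white Gaussian signal and noises of unit power, a pure delay between the channels, and no scattering, so only the noise and signal-randomness contributions to the fluctuation appear. The function names and parameter values are hypothetical.

```python
import numpy as np

def correlator_output(y1, y2, lag, start, T):
    """Discrete analogue of (1/T) * integral over [start, start+T) of y1(t) * y2(t - lag)."""
    t = np.arange(start, start + T)
    return np.mean(y1[t] * y2[t - lag])

def simulate_outputs(T, n_trials=200, delay=5, seed=0):
    """Correlator outputs at the true lag for independent realizations (assumed toy model)."""
    rng = np.random.default_rng(seed)
    n = T + delay
    out = np.empty(n_trials)
    for k in range(n_trials):
        x = rng.standard_normal(n)                       # Gaussian target signal
        y2 = x + rng.standard_normal(n)                  # undelayed channel plus noise
        delayed = np.concatenate([np.zeros(delay), x[:-delay]])
        y1 = delayed + rng.standard_normal(n)            # delayed channel plus independent noise
        out[k] = correlator_output(y1, y2, delay, delay, T)
    return out

short = simulate_outputs(T=100)
long_ = simulate_outputs(T=400)
print("T=100: mean %.3f, variance %.4f" % (short.mean(), short.var()))
print("T=400: mean %.3f, variance %.4f" % (long_.mean(), long_.var()))
```

In this toy model the mean of the output stays near the signal power while its variance shrinks roughly in proportion to 1/T; the slowly varying scattering contribution discussed in the text is not represented here.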


3.1 Cross-correlation of Direct and Surface Reflected Paths

In  this  section  we  consider  a  propagation  geometry  in  which  both 
direct  and  surface  reflected  paths  are  received  by  a  single  receiver  (see 
Figure  1.0-la).  It  is  assumed  that  each  path  can  be  processed  separately 
by  the  receiver.  This  might  be  accomplished  by  using  multi-directional 
sensors,  but  the  exact  nature  of  the  separation  will  not  concern  us  at 
this  point.  A  block  diagram  for  the  situation  is  shown  in  Figure  3.1-1. 


Figure 3.1-1. Direct vs. Surface Reflected Cross-correlator Processing
(block diagram with two stages: Propagation, Signal Processing)


The two channels are assumed to be corrupted by independent noises
$n_1(t)$ and $n_2(t)$. The filters $H_1(\omega)$ and $H_2(\omega)$ have been included
in order to provide pre-processing before correlation. The variable
delay parameter $\tau$ has been included as a "search" parameter for
estimating the multipath replication delay. The direct path is assumed
to transmit a delayed and attenuated version of the emitted signal


\[ H_d(\omega) \propto e^{-j\omega R_d/c} \tag{3.1-1} \]

where $R_d$ is the line-of-sight distance from target to receiver. Following
Equation (2.4-34) we define the cascade responses


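\[ H_{s1}(\omega,t) = \frac{e^{-j\omega t}}{2\pi} \int_{\gamma} H_1(\gamma)\, e^{j\gamma t}\, dg_s(\omega,\gamma-\omega) \tag{3.1-2} \]

\[ H_{d2}(\omega) = H_d(\omega)\, H_2(\omega) \tag{3.1-3} \]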


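The correlator operates during the interval $[p,p+T]$ to yield the
output $\Xi(\tau,T,p)$ at time $p+T$. Since the correlator acquires random
data for only a finite time $T$, $\Xi$ is a random variable. Its mean is
easily computed (using Equation (2.2-10)):

\[ \text{a)}\quad \Xi(\tau,T,p) = \frac{1}{T} \int_{p}^{p+T} y_1(t)\, y_2(t-\tau)^*\, dt \]

\[ \text{b)}\quad \Xi(\tau,T,p) = \frac{1}{T} \int_{p}^{p+T} \Big\{ \frac{1}{2\pi} \int_{\omega} \big[ H_1(\omega)\, dz_{n1}(\omega) + H_{s1}(\omega,t)\, dz_x(\omega) \big] \exp(j\omega t) \Big\}
\Big\{ \frac{1}{2\pi} \int_{\omega'} \big[ H_2(\omega')\, dz_{n2}(\omega') + H_{d2}(\omega')\, dz_x(\omega') \big] \exp\{j\omega'(t-\tau)\} \Big\}^* dt \tag{3.1-4} \]

On expanding the product:

\[ \Xi(\tau,T,p) = \frac{1}{T} \int_{p}^{p+T} \Big\{ \Big(\frac{1}{2\pi}\Big)^2 \int_{\omega}\int_{\omega'}
\Big[ H_1(\omega) \big[ H_2(\omega')^*\, dz_{n1}(\omega)\, dz_{n2}(\omega')^* + H_{d2}(\omega')^*\, dz_{n1}(\omega)\, dz_x(\omega')^* \big]
+ H_{s1}(\omega,t) \big[ H_2(\omega')^*\, dz_x(\omega)\, dz_{n2}(\omega')^* + H_{d2}(\omega')^*\, dz_x(\omega)\, dz_x(\omega')^* \big] \Big]
\exp\{ j(\omega-\omega')t + j\omega'\tau \} \Big\}\, dt \tag{3.1-5} \]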
Invoking the independence of $n_1(t)$, $n_2(t)$ and $x(t)$ and applying
Equation (2.1-5) this becomes

\[ \overline{\Xi}(\tau,T,p) = \frac{1}{T} \int_{p}^{p+T} dt\, \Big\{ \int_{\omega} H_d(\omega)^*\, H_2(\omega)^*\, \overline{H_{s1}(\omega,t)}\, e^{j\omega\tau}\, \frac{1}{2\pi}\, dZ_{xx}(\omega) \Big\} \tag{3.1-6} \]

The mean of $H_{s1}(\omega,t)$ can be obtained by averaging Equation (3.1-2) and
applying the following relation (which may be verified by insertion in
Equation (2.2-22))

\[ \overline{dg_s(\omega,\gamma-\omega)} = 2\pi\, H_c(\omega)\, \delta(\gamma-\omega)\, d\gamma \tag{3.1-7} \]

with the result that

\[ \overline{H_{s1}(\omega,t)} = H_c(\omega)\, H_1(\omega) \tag{3.1-8} \]

Finally, substituting this into Equation (3.1-6) we have

\[ \overline{\Xi}(\tau,T,p) = \int_{\omega} H_c(\omega)\, H_d(\omega)^*\, H_1(\omega)\, H_2(\omega)^*\, e^{+j\omega\tau}\, \frac{1}{2\pi}\, dZ_{xx}(\omega) \tag{3.1-9} \]
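Equation (3.1-9) can be evaluated numerically once the transfer functions are specified. The sketch below is illustrative only: it assumes a flat band-limited signal spectrum ($dZ_{xx}(\omega) = S_{xx}(\omega)\,d\omega$), pure-delay forms for $H_c$ and $H_d$, and unit pre-filters. None of these choices, nor the parameter values, come from the report.

```python
import numpy as np

def mean_correlator_output(tau, omega, H_c, H_d, H_1, H_2, S_xx):
    """Riemann-sum evaluation of the spectral integral in Eq. (3.1-9) at one delay tau."""
    integrand = H_c * np.conj(H_d) * H_1 * np.conj(H_2) * np.exp(1j * omega * tau) * S_xx
    d_omega = omega[1] - omega[0]
    return np.sum(integrand) * d_omega / (2.0 * np.pi)

# Assumed example parameters (hypothetical): 500 Hz bandwidth, path delays in seconds.
W = 2.0 * np.pi * 500.0
t_direct, t_surface = 0.004, 0.012
omega = np.linspace(-W, W, 2001)

H_d = np.exp(-1j * omega * t_direct)           # direct path: pure delay
H_c = 0.5 * np.exp(-1j * omega * t_surface)    # mean surface path: delay and attenuation
H_1 = np.ones_like(omega)                      # unit pre-filters
H_2 = np.ones_like(omega)
S_xx = np.ones_like(omega)                     # flat spectrum inside the band

taus = np.linspace(0.0, 0.02, 201)
mean_out = np.array([mean_correlator_output(t, omega, H_c, H_d, H_1, H_2, S_xx) for t in taus])
peak_tau = taus[np.argmax(mean_out.real)]
print("peak of mean output at tau =", peak_tau)
```

Under these assumptions the mean output peaks where the search delay equals the difference between the surface and direct path delays, which is the replication delay the correlator is searching for.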
3.2 Second Order Statistics for the Output of the Cross-correlator

In this section we regard the output $\Xi(\tau,T,p)$ of the cross-correlator
shown in Figure 3.1-1 as a time series which is a random
function of the initial starting time $p$ for the integrator. This output
fluctuates about its mean, causing uncertainty about the possibility of
the existence of target signal correlation. The most general second order
statistic that is of interest in analysis of this fluctuation is the
cross-covariance of $\Xi$ itself at two values $\tau$ and $\tau'$ for the replication
delay parameter and for two values $p$ and $p'$ for the starting
time:

\[ \Lambda(\tau,\tau',T,p,p') = \overline{\Xi(\tau,T,p)\, \Xi(\tau',T,p')} - \overline{\Xi}(\tau,T,p)\, \overline{\Xi}(\tau',T,p') \tag{3.2-1} \]

This quantity determines the spectrum of the output fluctuation, the persistence
and magnitude of these fluctuations, and the degree of dependence
of fluctuations at different values of the replication delay parameter.

In this section we focus attention on the computation of the first
term in Equation (3.2-1). The method used is essentially that given by
Davenport and Root, Laning and Battin, and Bendat (see also
Usher). From the definition of $\Xi$ we have


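\[ \overline{\Xi(\tau,T,p)\, \Xi(\tau',T,p')} = \Big(\frac{1}{T}\Big)^2 \int_{p}^{p+T} \int_{p'}^{p'+T} \Pi(\tau,\tau',t,t')\, dt'\, dt \tag{3.2-2} \]

where we define

\[ \Pi(\tau,\tau',t,t') = \overline{y_1(t)\, y_1(t')\, z(t-\tau)\, z(t'-\tau')} \tag{3.2-3} \]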
At this point it is assumed that this fourth order moment is stationary,
that is, we assume that the relation

\[ \Pi(\tau,\tau',t,t') = \Pi(\tau,\tau',t-t') \tag{3.2-4} \]

holds. We shall return to this assumption in sections 3.3 and 3.5 to
determine the conditions under which Equation (3.2-4) may be applied.
When Equation (3.2-4) is valid it suggests the change of variable

\[ v = t - t', \qquad dv = -dt' \tag{3.2-5} \]

Substituting this into Equation (3.2-2) we have

\[ \overline{\Xi(\tau,T,p)\, \Xi(\tau',T,p')} = \Big(\frac{1}{T}\Big)^2 \int_{p}^{p+T} \Big\{ \int_{t-p'-T}^{t-p'} \Pi(\tau,\tau',v)\, dv \Big\}\, dt \tag{3.2-6} \]

Next, noting that the integration over $t$ is between fixed limits while
the integrand is a function of only $v$ we seek to reverse this situation
by interchanging the order of integration. Unfortunately, this forces us
to break the integral up into two terms, one for the range $v < p - p'$
(area A in Figure 3.2-1b) and the other for the range $v > p - p'$
(area B).
Figure 3.2-1. Regions of Integration Before (a) and After (b) Transformation


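Executing the interchange of order we have:

\[ \overline{\Xi(\tau,T,p)\, \Xi(\tau',T,p')} = \Big(\frac{1}{T}\Big)^2 \Big\{ \int_{p-p'-T}^{p-p'} dv \int_{p}^{v+p'+T} dt\; \Pi(\tau,\tau',v)
+ \int_{p-p'}^{p-p'+T} dv \int_{v+p'}^{p+T} dt\; \Pi(\tau,\tau',v) \Big\} \tag{3.2-7} \]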


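The integration over $t$ is now easily performed giving the result

\[ \overline{\Xi(\tau,T,p)\, \Xi(\tau',T,p')} = \Big(\frac{1}{T}\Big)^2 \Big\{ \int_{p-p'-T}^{p-p'} [T + v - (p-p')]\, \Pi(\tau,\tau',v)\, dv
+ \int_{p-p'}^{p-p'+T} [T - v + (p-p')]\, \Pi(\tau,\tau',v)\, dv \Big\} \]

\[ = \frac{1}{T} \big[ V_1(p-p'+T) - V_1(p-p'-T) \big]
- \Big(\frac{1}{T}\Big)^2 \big[ V_2(p-p'-T) - 2 V_2(p-p') + V_2(p-p'+T) \big]
+ \frac{p-p'}{T^2} \big[ V_1(p-p'-T) - 2 V_1(p-p') + V_1(p-p'+T) \big] \tag{3.2-8} \]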


which is a function of $(p-p')$. Here we define

\[ \text{a)}\quad V_1(\xi) = \int_{0}^{\xi} \Pi(\tau,\tau',v)\, dv \qquad
\text{b)}\quad V_2(\xi) = \int_{0}^{\xi} v\, \Pi(\tau,\tau',v)\, dv \tag{3.2-9} \]


Equation (3.2-8) simplifies greatly for the case $p = p'$ and $\tau = \tau'$.
In this case we may interchange $t$ and $t'$ in Equation (3.2-3) yielding

\[ \Pi(\tau,\tau,v) = \Pi(\tau,\tau,-v) \tag{3.2-10} \]

Applying this to the first integral of Equation (3.2-8) we obtain the
well-known result

\[ \overline{\Xi(\tau,T,p)\, \Xi(\tau,T,p)} = 2 \Big(\frac{1}{T}\Big)^2 \int_{0}^{T} [T - v]\, \Pi(\tau,\tau,v)\, dv
= 2 \Big[ \frac{1}{T} V_1(T) - \Big(\frac{1}{T}\Big)^2 V_2(T) \Big] \tag{3.2-11} \]
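The reduction from the double integral (3.2-2) to the single integral with triangular weighting (3.2-11) can be checked numerically. The sketch below takes $p = p' = 0$ and uses an even test function $\Pi(v) = e^{-|v|}$, which is an arbitrary stand-in chosen for illustration and not a moment function from the report.

```python
import numpy as np

def double_integral(Pi, T, n=1000):
    """Midpoint-rule estimate of (1/T^2) * int_0^T int_0^T Pi(t - t') dt dt'."""
    t = (np.arange(n) + 0.5) * T / n
    v = t[:, None] - t[None, :]
    return np.mean(Pi(v))

def triangular_integral(Pi, T, n=200000):
    """Midpoint-rule estimate of 2 * (1/T)^2 * int_0^T (T - v) Pi(v) dv, for even Pi."""
    v = (np.arange(n) + 0.5) * T / n
    return 2.0 / T**2 * np.sum((T - v) * Pi(v)) * (T / n)

Pi = lambda v: np.exp(-np.abs(v))   # assumed even test function
T = 5.0
lhs = double_integral(Pi, T)
rhs = triangular_integral(Pi, T)
exact = 2.0 / T**2 * (T - 1.0 + np.exp(-T))   # closed form for this particular Pi
print(lhs, rhs, exact)
```

Both estimates agree with the closed form for this test function, which illustrates why only the lag $v$ and the weight $T - |v|$ matter once the fourth order moment is stationary.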
3.3 Correlator Fluctuation for Direct and Surface Reflected Multipath Processor

In this section we compute the fourth order product function
$\Pi(\tau,\tau',t,t')$ defined in Equation (3.2-3) for the direct vs. surface
reflected path processor shown in Figure 3.1-1. Rewriting Equation
(3.2-3) in the frequency domain proves to be useful in later computations.
Thus, we have from Equation (3.2-3)


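\[ \Pi(\tau,\tau',t,t') = \Big(\frac{1}{2\pi}\Big)^4\, \overline{ \int_{\omega}\int_{\omega'}\int_{\omega''}\int_{\omega'''}
e^{\, j(\omega+\omega'')t - j(\omega'+\omega''')t' - j(\omega''\tau - \omega'''\tau')}
\big[ H_1(\omega)\, dz_{n1}(\omega) + H_{s1}(\omega,t)\, dz_x(\omega) \big]
\big[ H_1(\omega')\, dz_{n1}(\omega') + H_{s1}(\omega',t')\, dz_x(\omega') \big]^*
\big[ H_2(\omega'')\, dz_{n2}(\omega'') + H_{d2}(\omega'')\, dz_x(\omega'') \big]
\big[ H_2(\omega''')\, dz_{n2}(\omega''') + H_{d2}(\omega''')\, dz_x(\omega''') \big]^* } \tag{3.3-1} \]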

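On expanding the product of the four factors in Equation (3.3-1) we
obtain 16 terms. By the mutual independence and zero mean of $n_1(t)$,
$n_2(t)$ and $x(t)$ all but 4 of these terms are found to be zero. The
remaining terms are as follows:

\[ \Pi(\tau,\tau',t,t') = \Big(\frac{1}{2\pi}\Big)^4\, \overline{ \int_{\omega}\int_{\omega'}\int_{\omega''}\int_{\omega'''}
e^{\, j(\omega+\omega'')t - j(\omega'+\omega''')t' - j(\omega''\tau - \omega'''\tau')}
\Big\{ H_1(\omega)\, H_1(\omega')^*\, dz_{n1}(\omega)\, dz_{n1}(\omega')^* \big[
H_2(\omega'')\, H_2(\omega''')^*\, dz_{n2}(\omega'')\, dz_{n2}(\omega''')^* +
H_{d2}(\omega'')\, H_{d2}(\omega''')^*\, dz_x(\omega'')\, dz_x(\omega''')^* \big]
+ H_{s1}(\omega,t)\, H_{s1}(\omega',t')^* \big[
H_2(\omega'')\, H_2(\omega''')^*\, dz_x(\omega)\, dz_x(\omega')^*\, dz_{n2}(\omega'')\, dz_{n2}(\omega''')^*
+ H_{d2}(\omega'')\, H_{d2}(\omega''')^*\, dz_x(\omega)\, dz_x(\omega')^*\, dz_x(\omega'')\, dz_x(\omega''')^* \big] \Big\} } \tag{3.3-2} \]

We now make the following assumptions:

(1) The noises $n_1(t)$ and $n_2(t)$ are wide sense
stationary.

(2) The signal $x(t)$ emitted at the target is a
Gaussian process and stationary.                                  (3.3-3)

(3) The scattering RLTVF is interfrequency wide-sense
stationary (IWSS).

Therefore, using Equation (2.1-8) and (2.1-18)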
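\[ \Pi(\tau,\tau',t,t') = \Big(\frac{1}{2\pi}\Big)^2 \int_{\omega}\int_{\omega'}\int_{\omega''}\int_{\omega'''}
e^{\, j(\omega+\omega'')t - j(\omega'+\omega''')t' - j(\omega''\tau - \omega'''\tau')}
\Big\{ |H_1(\omega)|^2\, S_{n_1n_1}(\omega)\, \delta(\omega-\omega')\, \delta(\omega''-\omega''') \big[
|H_2(\omega'')|^2\, S_{n_2n_2}(\omega'') + |H_{d2}(\omega'')|^2\, S_{xx}(\omega'') \big]
+ \Phi_{s1}(\omega,\omega',t-t') \Big[
|H_2(\omega'')|^2\, S_{xx}(\omega)\, S_{n_2n_2}(\omega'')\, \delta(\omega-\omega')\, \delta(\omega''-\omega''')
+ H_{d2}(\omega'')\, H_{d2}(\omega''')^* \big[
S_{xx}(\omega)\, \delta(\omega-\omega')\, S_{xx}(\omega'')\, \delta(\omega''-\omega''') +
S_{xx}(\omega)\, \delta(\omega+\omega'')\, S_{xx}(\omega''')\, \delta(\omega'''+\omega') +
S_{xx}(\omega)\, S_{xx}(\omega')\, \delta(\omega'-\omega'')\, \delta(\omega-\omega''') \big] \Big] \Big\}\,
d\omega\, d\omega'\, d\omega''\, d\omega''' \tag{3.3-4} \]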


where

\[ \Phi_{s1}(\omega,\omega',\mu) = \overline{H_{s1}(\omega,t)\, H_{s1}(\omega',t-\mu)^*}
= \frac{1}{2\pi} \int_{\gamma} H_1(\gamma)\, H_1(\gamma-\omega+\omega')^*\, e^{j(\gamma-\omega)\mu}\, d_{\gamma} A_s(\omega,\omega',\gamma-\omega) \tag{3.3-5} \]

is the interfrequency system correlation function for the cascade of the
scattering and filter $H_1$. Separating the last two terms of Equation
(3.3-4) and performing two integrals over frequency for all terms we
have:
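\[ \Pi(\tau,\tau',t,t') = \Big(\frac{1}{2\pi}\Big)^2 \Big\{ \int_{\omega}\int_{\omega''}
e^{\, j(\omega+\omega'')(t-t') - j\omega''(\tau-\tau')}\, |H_2(\omega'')|^2
\big[ |H_1(\omega)|^2\, S_{n_1n_1}(\omega) + \Phi_{s1}(\omega,\omega,t-t')\, S_{xx}(\omega) \big]
\big[ S_{n_2n_2}(\omega'') + |H_d(\omega'')|^2\, S_{xx}(\omega'') \big]\, d\omega\, d\omega''
+ \int_{\omega}\int_{\omega'} \Phi_{s1}(\omega,\omega',t-t')\, H_{d2}(\omega)^*\, H_{d2}(\omega')\, S_{xx}(\omega)\, S_{xx}(\omega')
\big( e^{+j(\omega\tau - \omega'\tau')} + e^{+j(\omega+\omega')(t-t') - j(\omega'\tau - \omega\tau')} \big)\, d\omega\, d\omega' \Big\} \tag{3.3-6} \]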


Thus, it can be seen that when the assumptions (3.3-3) are satisfied
Equation (3.2-4) does indeed hold. Finally, from Equation (3.2-1) and
taking $\mu = p - p'$ we have

\[ \Lambda(\tau,\tau',T,p,p') = \Lambda(\tau,\tau',T,p-p') = \Lambda(\tau,\tau',T,\mu) \tag{3.3-7} \]

Substituting Equation (3.3-6) into (3.2-8) we may rewrite Equation
(3.2-1) in summary form:
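where:

\[ I_1(v) = \int_{\omega} e^{j\omega v}\, |H_1(\omega)|^2\, S_{n_1n_1}(\omega)\, \frac{d\omega}{2\pi} \tag{3.3-9} \]

\[ I_2(v) = \int_{\omega} e^{j\omega v}\, \Phi_{s1}(\omega,\omega,v)\, S_{xx}(\omega)\, \frac{d\omega}{2\pi} \tag{3.3-10} \]

\[ I_3(v) = \int_{\omega} e^{j\omega v}\, |H_2(\omega)|^2 \big[ S_{n_2n_2}(\omega) + |H_d(\omega)|^2\, S_{xx}(\omega) \big]\, \frac{d\omega}{2\pi} \tag{3.3-11} \]

\[ I_4(\tau,\tau',v) = \int_{\omega}\int_{\omega'} e^{j(\omega\tau - \omega'\tau')}\, \Phi_{s1}(\omega,\omega',v)\, H_{d2}(\omega)^*\, H_{d2}(\omega')\, S_{xx}(\omega)\, S_{xx}(\omega')\, \frac{d\omega}{2\pi}\, \frac{d\omega'}{2\pi} \tag{3.3-12} \]

\[ I_5(\tau,\tau') = \overline{\Xi}(\tau,T,p)\, \overline{\Xi}(\tau',T,p')
= \Big[ \int_{\omega} H_c(\omega)\, H_d(\omega)^*\, H_1(\omega)\, H_2(\omega)^*\, e^{+j\omega\tau}\, S_{xx}(\omega)\, \frac{d\omega}{2\pi} \Big]
\Big[ \int_{\omega'} H_c(\omega')^*\, H_d(\omega')\, H_1(\omega')^*\, H_2(\omega')\, e^{-j\omega'\tau'}\, S_{xx}(\omega')\, \frac{d\omega'}{2\pi} \Big] \tag{3.3-13} \]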
3.4 Two Receiver Array Cross-correlator Processing

In this section we apply the same technique used in section 3.1 to
the problem of two channel array cross-correlation. In this case the
propagation geometry for the two channels is assumed similar in nature
and statistically correlated. This geometry might consist only of a pair
of surface reflected paths (see Figure 1.0-1b), or for a slight increase
in complexity, one might include direct transmission (Figure 1.0-1c).

The geometry is, in fact, arbitrary for the purposes of this section with
the exception of the assumption of statistical symmetry between the two
channels. A block diagram for this processor is shown in Figure 3.4-1.

Figure 3.4-1. Two Receiver Array Cross-correlator

As a special case, when the two receivers are drawn together so that they
coalesce, $\Xi$ becomes an auto-correlation estimate.

For the analysis of this section $n_1(t)$ and $n_2(t)$ are not assumed
to be independent. We define the cascade responses
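\[ H_{1f}(\omega,t) = \frac{e^{-j\omega t}}{2\pi} \int_{\gamma} H_f(\gamma)\, e^{j\gamma t}\, dg_1(\omega,\gamma-\omega) \tag{3.4-1} \]

\[ H_{2f}(\omega,t) = \frac{e^{-j\omega t}}{2\pi} \int_{\gamma} H_f(\gamma)\, e^{j\gamma t}\, dg_2(\omega,\gamma-\omega) \tag{3.4-2} \]

Following Equation (3.1-4) we compute the mean of $\Xi$

\[ \Xi(\tau,T,p) = \frac{1}{T} \int_{p}^{p+T} \Big\{ \frac{1}{2\pi} \int_{\omega} \big[ H_f(\omega)\, dz_{n1}(\omega) + H_{1f}(\omega,t)\, dz_x(\omega) \big] \exp(j\omega t) \Big\}
\Big\{ \frac{1}{2\pi} \int_{\omega'} \big[ H_f(\omega')\, dz_{n2}(\omega') + H_{2f}(\omega',t-\tau)\, dz_x(\omega') \big] \exp\{j\omega'(t-\tau)\} \Big\}^* dt \]

\[ = \frac{1}{T} \int_{p}^{p+T} \Big\{ \Big(\frac{1}{2\pi}\Big)^2 \int_{\omega}\int_{\omega'}
\Big[ H_f(\omega) \big[ H_f(\omega')^*\, dz_{n1}(\omega)\, dz_{n2}(\omega')^* + H_{2f}(\omega',t-\tau)^*\, dz_{n1}(\omega)\, dz_x(\omega')^* \big]
+ H_{1f}(\omega,t) \big[ H_f(\omega')^*\, dz_x(\omega)\, dz_{n2}(\omega')^* + H_{2f}(\omega',t-\tau)^*\, dz_x(\omega)\, dz_x(\omega')^* \big] \Big]
\exp\{ j(\omega-\omega')t + j\omega'\tau \} \Big\}\, dt \tag{3.4-3} \]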


Once again, we assume that the signal $x(t)$ is independent from the
background noise in each of the two channels. Also, we assume wide-sense
stationary signals and noises, and wide-sense stationary (WSS)
scattering. Allowing for the dependence of $n_1(t)$ and $n_2(t)$ we have

\[ \overline{\Xi}(\tau,T,p) = \int_{\omega} e^{j\omega\tau}\, \frac{1}{2\pi} \big[ |H_f(\omega)|^2\, dZ_{n_1n_2}(\omega) + \Phi_{1f2f}(\omega,\omega,\tau)\, dZ_{xx}(\omega) \big] \tag{3.4-4} \]
where the parallel system cross-correlation function for the series
cascade of scattering and filters is obtained in the same manner as
Equation (2.4-36):

\[ \Phi_{1f2f}(\omega,\omega,\tau) = \overline{H_{1f}(\omega,t)\, H_{2f}(\omega,t-\tau)^*}
= \frac{e^{-j\omega\tau}}{2\pi} \int_{\gamma} |H_f(\gamma)|^2\, e^{j\gamma\tau}\, dA_{12}(\omega,\omega,\gamma-\omega) \tag{3.4-5} \]
3.5 Cross-correlator Fluctuation for the Two Receiver Array

We now apply the analysis of section 3.3 to the problem of describing
the correlator fluctuation for the two receiver array. The equivalent of
Equation (3.3-1) for this case is

\[ \Pi(\tau,\tau',t,t') = \Big(\frac{1}{2\pi}\Big)^4\, \overline{ \int_{\omega}\int_{\omega'}\int_{\omega''}\int_{\omega'''}
e^{\, j(\omega+\omega'')t - j(\omega'+\omega''')t' - j(\omega''\tau - \omega'''\tau')}
\big[ H_f(\omega)\, dz_{n1}(\omega) + H_{1f}(\omega,t)\, dz_x(\omega) \big]
\big[ H_f(\omega')\, dz_{n1}(\omega') + H_{1f}(\omega',t')\, dz_x(\omega') \big]^*
\big[ H_f(\omega'')\, dz_{n2}(\omega'') + H_{2f}(\omega'',t-\tau)\, dz_x(\omega'') \big]
\big[ H_f(\omega''')\, dz_{n2}(\omega''') + H_{2f}(\omega''',t'-\tau')\, dz_x(\omega''') \big]^* } \tag{3.5-1} \]

Once again, on expanding this product we obtain sixteen terms. For this
case, however, we are assuming that $n_1(t)$ and $n_2(t)$ are partially
correlated. Nevertheless, the signal $x(t)$ remains independent of the
noises. Therefore, of the sixteen terms in the product all but eight
are still found to be zero. Equation (3.5-1) becomes
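\[ \Pi(\tau,\tau',t,t') = \Big(\frac{1}{2\pi}\Big)^4\, \overline{ \int_{\omega}\int_{\omega'}\int_{\omega''}\int_{\omega'''}
e^{\, j(\omega+\omega'')t - j(\omega'+\omega''')t' - j(\omega''\tau - \omega'''\tau')} \Big\{ \]
\[ \big[ H_f(\omega)\, H_f(\omega')^*\, H_f(\omega'')\, H_f(\omega''')^*\;
dz_{n1}(\omega)\, dz_{n1}(\omega')^*\, dz_{n2}(\omega'')\, dz_{n2}(\omega''')^* \big] + \]
\[ \big[ H_f(\omega)\, H_f(\omega')^*\, H_{2f}(\omega'',t-\tau)\, H_{2f}(\omega''',t'-\tau')^*\;
dz_{n1}(\omega)\, dz_{n1}(\omega')^*\, dz_x(\omega'')\, dz_x(\omega''')^* \big] + \]
\[ \big[ H_f(\omega)\, H_f(\omega'')\, H_{1f}(\omega',t')^*\, H_{2f}(\omega''',t'-\tau')^*\;
dz_{n1}(\omega)\, dz_{n2}(\omega'')\, dz_x(\omega')^*\, dz_x(\omega''')^* \big] + \]
\[ \big[ H_f(\omega)\, H_f(\omega''')^*\, H_{1f}(\omega',t')^*\, H_{2f}(\omega'',t-\tau)\;
dz_{n1}(\omega)\, dz_{n2}(\omega''')^*\, dz_x(\omega')^*\, dz_x(\omega'') \big] + \]
\[ \big[ H_f(\omega')^*\, H_f(\omega'')\, H_{1f}(\omega,t)\, H_{2f}(\omega''',t'-\tau')^*\;
dz_{n1}(\omega')^*\, dz_{n2}(\omega'')\, dz_x(\omega)\, dz_x(\omega''')^* \big] + \]
\[ \big[ H_f(\omega')^*\, H_f(\omega''')^*\, H_{1f}(\omega,t)\, H_{2f}(\omega'',t-\tau)\;
dz_{n1}(\omega')^*\, dz_{n2}(\omega''')^*\, dz_x(\omega)\, dz_x(\omega'') \big] + \]
\[ \big[ H_f(\omega'')\, H_f(\omega''')^*\, H_{1f}(\omega,t)\, H_{1f}(\omega',t')^*\;
dz_{n2}(\omega'')\, dz_{n2}(\omega''')^*\, dz_x(\omega)\, dz_x(\omega')^* \big] + \]
\[ \big[ H_{1f}(\omega,t)\, H_{1f}(\omega',t')^*\, H_{2f}(\omega'',t-\tau)\, H_{2f}(\omega''',t'-\tau')^*\;
dz_x(\omega)\, dz_x(\omega')^*\, dz_x(\omega'')\, dz_x(\omega''')^* \big] \Big\} } \tag{3.5-2} \]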
We now make the following assumptions:

(1) $n_1(t)$ and $n_2(t)$ are jointly stationary and
Gaussian.

(2) The signal $x(t)$ emitted at the target is a
stationary Gaussian process.                                      (3.5-3)

(3) The systems $H_1(\omega,t)$ and $H_2(\omega,t)$ belong to the
class of cross-fourth-order-interfrequency stationary
systems (CFOIS) satisfying the relation

\[ \Phi^{[4]}_{12}(\omega,\omega',\omega'',\omega''',\mu,\mu',\mu'') =
\overline{H_1(\omega,t)\, H_1(\omega',t-\mu)^*\, H_2(\omega'',t-\mu')\, H_2(\omega''',t-\mu'')^*} \tag{3.5-4} \]

In general, the fourth order system correlation function for the cascade
of scattering and filters may be obtained as in Equation (2.4-41) from
the definitions (3.5-4), (3.4-1) and (3.4-2):


A* ]  ...  ....  u  u,  u„v  .  “J (w* w-u)MP ' W ' y") 

v.  2  (<*>,<*>  ,u  ,oj  ,y,u  ,y  ;  ■  e 
f  f 


III 

'  t  ^  ■  a  »  »»  ■ 


+j  (w*  u '  -o)Mu"+a)M '  u"  ’ ) 


(3.5-5) 


u'  u"  V" 


T[4](w-a,,W-u),,,,p,u,,p,,,u,,u,,>u,M)  do’  du,;  do"' 


where  the  appropriate  extension  of  Equation  (2.4-42)  is 


T[4](Y,P,P\UM,U\U",U'M)  - 


(3.5-6) 


Je+^°  h^(o)  h^a-y+u’)  h^(o-y  f+u")  h^(o-y"+u,M)  do 
When assumptions (3.5-3) hold, using (2.1-18), (2.1-23), (2.1-16) and
(2.1-17) we have


n(x,T',t,t’) 

<1# 


j  (urt-to")  t-j  (w '+□" ' )  t  ’  - j  (u>"t -to" ' t  ' ) 


w  w'  V'  to"' 


Hf (oj)  Hf (taf )*  Hf(u>")  Hf(to"’)*  [ 


S  (to)  6  (to— to  * )  S  (to")  6(w"-to,M)  + 
nl"l  n2n2 


S  (oj)  6(<i>to")  S  (u,M)  6(u"'+u’)  + 

nin2  nin2 


S  (u)  5 (cj—uj" ' )  S  (to1)  6(to'-to")  ] 
nln2  n2nl 


|Hf(u>)|  4>2f  (w‘^w",t-t,-T+T,) 


S  (to)  6 (oj— u> * )  S  (to")  6(to"-w"') 
n^n^  xx 


+  |nf(u)  1 2  2 

£  £ 


S  (oj)  6(urt-to")  S  (a)"')  6(to’W") 
n^2  xx 


+  |Hf (w) |  .  (to' ,to* , t-t '-t) 

f  f 


S  n  (to)  6(to-u)M')  S  (to*)  6(w"-to') 


+  |Hf(to*)|2  ?  (to, to, t-t  *+T  ') 

Tf 


S  (to’)  6(to'-to")  S  (w)  6(to-to,M) 
n2n^  xx 


+  |H.(to"’)  -  ( to, u,  — t )  * 

f  f 


S  (to"’)  6(u,+toM ')  S  (to)  6(uH-toM) 
n1n2  xx 


(3.5-7) 
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+  | Hf  (a>")  | 2 


S„  „  (u")  6(w"-ui"')  S  (u)  S(u-u’) 
n  .^n f)  xx 


f  AT 

+  4>!  I  (o)><i>,fOI,,,U,M  tt-t' .T.t-t'+T*)  [ 

Af  f 

S  (u)  6 (ui—tu * )  S  (a)n)  6(a)"-o)!i ' )  + 


S(w)  6(a>W)  S  (ai"')  6(a),,,+w»)  + 

XX  XX 


S  (u)  6W)  S  (u")  «(w'-u”)  ] 

Xvv  XX 


dw  dw'  da)"  da)"' 


Upon  rearranging  terms  and  performing  two  integrals  over  frequency  for 
all  terms  we  have 

2 


n (x  , x '  ,t,t')  *  ( 


fe>  -  /  / 


iCorfu")  (t-*t’)-j(T-T,)u)M 


0)  0) 


II 


|M“)  |2  (|Hf(u")|2  s  (y)  s  (»”) 
r  1  nlnl  n2n2 


+  *2f(<AtAt-t’-T+T’)  S  (o>)  Sxx(0)")  ] 


+  *  (w.oj.t-t')  |H  (u>")  1 2  n  (U)*')  S  (o>) 

it  r  n2n2  305 


+  (o),u),a)"  ,w"  ,t-t '  .Tjt-t'+T’)  S  ((d)  S  (id")  |  da)  do)"  } 

XX  XX  I 


+  ( 


fe-// 

^  ill 
0)  0) 


~j  (-0)1-0)'  't  ’) 


|H  (u)  |2  [|Hf(u,'")|2  S  „  (u)  S  n  («•••) 


nln2  nln2 


+  .  ,."V>  s  W>  SXX(U"')  ) 
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+  ,  (w,w,+t  )  |Hf  \  sn  (U)'M)  S  (u>) 

!.£*£  t  '  n]^n2  XX 


+  *lj2  ‘-‘’.T.t-t'+T’)  SXX(U'")  J  du>  d<o"’  ) 


+  ( 


fc>!  '  /  / 


1+j((J+w’)  (t  — t*)— J  (u}'  T— 0>T  *  ) 


U)  0) 


|Hf(w)|2  [  |H_ (oj* )  j 2  sn  n  (u)  sn  n  (u»*) 
'r  '  r  1  nin2  n2nl 


+  4u  1  (o)'  ,u)'  .t-t’-x)  S  (w)  S  (oj1)  ] 

Zfif  ni2  xx 

.m2 


+  ?  (u>,U),  t-t  '+T  1 )  |H  (w')|  S  (<*>’)  S  (a)) 

+  (w.to’  ,0)'  jT.t-t'+T')  S  (w)  S  (id1) 

X  r ^ ^  AA  AA 


dw  dw'  } 


(3.5-8) 


It follows again that the assumptions stated imply that $\Pi(\tau,\tau',t,t')$
is a function of the displacement $t-t'$. Therefore, the results of
section 3.2 apply.

Rewriting Equation (3.5-8) in summary form we have the equivalent of
Equation (3.3-8)
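where we have assumed symmetry between the two channels, defining the
quantities

\[ S_{nn}(\omega) = S_{n_1n_1}(\omega) = S_{n_2n_2}(\omega) \tag{3.5-10} \]

\[ \Phi_{f}(\omega,\omega,u) = \Phi_{1f1f}(\omega,\omega,u) = \Phi_{2f2f}(\omega,\omega,u) \tag{3.5-11} \]

and the integrals

\[ M_a(u) = \int_{\omega} e^{j\omega u}\, |H_f(\omega)|^2\, S_{nn}(\omega)\, \frac{d\omega}{2\pi} \tag{3.5-12} \]

\[ K_a(u) = \int_{\omega} e^{j\omega u}\, \Phi_{f}(\omega,\omega,u)\, S_{xx}(\omega)\, \frac{d\omega}{2\pi} \tag{3.5-13} \]

\[ M_c(u) = \int_{\omega} e^{j\omega u}\, |H_f(\omega)|^2\, S_{n_1n_2}(\omega)\, \frac{d\omega}{2\pi} \tag{3.5-14} \]

\[ K_c(u) = \int_{\omega} e^{j\omega u}\, \Phi_{1f2f}(\omega,\omega,u)\, S_{xx}(\omega)\, \frac{d\omega}{2\pi} \tag{3.5-15} \]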


J1(t,t' ,u) 


fij [toU+to' (u-t+t ') ) 


(3.5-16) 


[4] 

C1!  ?  (to, w, to’  ,w’  ,u,t,u+t’)  S  (u) 

XX 


C  /  IN  du  dto 

S  (to  )  -i—  - — 
xx  2v  2ir 


J2(t,t ' ,u) 


j(toT+toV) 


(3.5-17) 


^ 2  (to, -u'  ,-to,to’  ,u,t,u+t') 
Tf 


S 

xx 


(w)  s. 


xx 


(to’) 


dto  dto* 
2tt  2 it 
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VT 


.x-.u)  .  J  J, 

fa)  fa)' 


JlfaKu+T'Hw'Cu-T)] 


(3.5-18) 


(“•“''“'•“•u*T>u+T’)  Sxx(u)  Sxx(“,)  If  If" 


J4(t>T,)  ‘  (2^  1  lJ  *12£(“.«.’)  sxx(“>  eJ“T  d“  ) 

l  J  *1  2  (“'.“'.x')  S^Cia')  e^“  T  du'  )  ) 

-  K  (t)  K  (t’) 
c  c 


(3.5-19) 


-J 
3.6 Correlator Tracking Error

In both the multipath and array type cross-correlators of Figures
3.1-1 and 3.4-1 a target is detected when the appropriate signal replication
produces a peak of the output $\Xi$ which is reliably discernible
above the estimation noise (see Figure 3.6-1a). This peak of $\Xi(\tau,T,p)$
occurs in the vicinity of the value $\tau_0$ of the replication delay for
any given value of the "read-out" time $p+T$. Since $\Xi$ is a random
variable the position of the peak fluctuates around the true value $\tau_0$.
Once the target is acquired in the display of $\Xi$ as a function of $\tau$ it
is usually desirable to obtain and track the peak with increased accuracy.

Figure 3.6-1. $\Xi$ and $\Xi'$ as a Function of $\tau$ for a Given $p$

A suitable method of measuring the accuracy of this scheme of
estimating $\tau_0$ is to obtain the variance of the location of the main
peak in $\Xi$.^{60,61,62} This is equivalent to obtaining the variance of the
corresponding zero crossing in the derivative of $\Xi$

\[ \Xi'(\tau,T,p) = \frac{\partial}{\partial\tau}\, \Xi(\tau,T,p) \tag{3.6-1} \]
This assumes that the derivative exists in the vicinity of this null.
Furthermore, it is assumed that only one isolated null exists near $\tau_0$.
This in turn requires appropriate limitations on the bandwidth of the
signals and noises (which are normally satisfied).

Asymptotically with large $T$ we expect the variance of the zero
crossing to approach the variance of $\Xi'$ divided by the square of the
mean slope of $\Xi'$ at the null $\tau_0$.

To  be  valid  this  formula  requires  T  to  be  large  enough  so  that  the 
fluctuation  of  the  zero  crossing  is  smaller  than  the  width  of  the 
correlation  peak.  This  permits  the  linearization  implied  by  Equation 
(3.6-2)  (see  Figure  3.6-1). 
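The linearization described in words above can be sketched as a worked relation. The symbol $\sigma_{\tau_0}^2$ for the variance of the zero-crossing location is notation assumed here for illustration and is not taken from the report; the right-hand side simply transcribes "variance of $\Xi'$ divided by the squared mean slope at the null".

```latex
\sigma_{\tau_0}^2 \;\approx\;
\frac{\overline{\left[\,\Xi'(\tau_0,T,p)\,\right]^2}}
     {\left[\,\left.\dfrac{\partial}{\partial\tau}\,\overline{\Xi'}(\tau,T,p)\right|_{\tau=\tau_0}\right]^2}
```

The numerator equals the variance of $\Xi'$ because its mean vanishes at the null, and the approximation holds only while the fluctuation of the zero crossing stays within the nearly linear portion of the mean derivative curve.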

By entirely similar reasoning we may consider the zero-crossing to
be a function of $p$ and compute an auto-correlation function for this
random variable:

\[ R_{\tau}(\mu) \to \left. \frac{ \dfrac{\partial^2 \Lambda(\tau,\tau',T,\mu)}{\partial\tau\, \partial\tau'} }
{ \left[ \dfrac{\partial^2\, \overline{\Xi}(\tau,T,p)}{\partial\tau^2} \right]^2 } \right|_{\tau = \tau' = \tau_0} \tag{3.6-3} \]

This quantity measures the persistence of the tracking errors in the same
way that $\Lambda$ measures the persistence of fluctuations in $\Xi$.
3.7 Concerning the Departure of the Statistics of the Scattered Signals from the Gaussian

A  key  assumption  made  in  sections  3.3  and  3.5  is  that  the  target 
signal  x(t)  is  Gaussian,  i.e.  its  time  samples  are  statistically 
described by the probability density Equation (2.1-11). After passage
of  such  a  signal  through  a  random  scattering  system  the  joint  probability 
density  of  the  scattered  signal  samples  is  usually  no  longer  Gaussian. 
Making  certain  assumptions  about  convergence,  the  joint  density  for  N 
samples  of  a  non-Gaussian  process  can  be  written  in  terms  of  an  Edgeworth 
expansion^"*  as  follows: 


f<y1»y2»*-‘»yN> 


CO 


exp  {^T  (- 
v>3 


1)V  < 


VlV*,VN 


N 


i-1 


(«*.,  )  ]  )  e(y1,y2,...,yM)  (3.7-1) 


(vi)l  '3yi 


N 


where all possible permutations of the $\nu_i$ are taken subject to

\[ \nu = \sum_{i=1}^{N} \nu_i \tag{3.7-2} \]

and where $\theta(y_1,y_2,\ldots,y_N)$ is the Gaussian density given by Equation
(2.1-11) using the correlation matrix $R$.

The constants $\kappa_{\nu_1\nu_2\ldots\nu_N}$ are termed the cumulants for the joint
density $f$ and the parameter $\nu$ denotes the order of the cumulants.
Cumulants of order 1 are defined as the means of the $y_i$ (these are
zero in the present context). Cumulants of order 2 are correlations
between samples and are elements of the matrix $R$. Cumulants of order
3 or higher can be obtained as in the case of one-dimensional probability
densities from the various moments:


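\[ \mu_{l_1 l_2 \ldots l_N} = \int_{y_1}\int_{y_2}\cdots\int_{y_N} \prod_{k=1}^{N} y_k^{\,l_k}\; f(y_1,y_2,\ldots,y_N)\; dy_1\, dy_2 \cdots dy_N \tag{3.7-3} \]

This is accomplished by forming the characteristic function corresponding
to Equation (3.7-1):

\[ Q(u_1,u_2,\ldots,u_N) = 1 + \sum_{l \ge 1} \mu_{l_1 l_2 \ldots l_N} \prod_{k=1}^{N} \frac{(j u_k)^{l_k}}{(l_k)!}
= \exp\Big\{ \sum_{\nu \ge 1} \kappa_{\nu_1 \nu_2 \ldots \nu_N} \prod_{i=1}^{N} \frac{(j u_i)^{\nu_i}}{(\nu_i)!} \Big\} \tag{3.7-4} \]

where all permutations of the $\nu_i$ and $l_k$ are taken subject to the
constraints Equation (3.7-2) and

\[ l = \sum_{k=1}^{N} l_k \tag{3.7-5} \]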


By taking the log of Equation (3.7-4) and equating coefficients of equal
products of the $u_k$ the relations giving the cumulants in terms of the
moments are obtained.

When all cumulants of order 3 or greater are zero the density $f$
reduces to the Gaussian function $\theta$. Hence the cumulants of order $\ge 3$
measure the departure of $f$ from $\theta$. Because scattering is modeled
here as a linear process, the odd order moments and cumulants of $f$ are
zero due to Equation (2.1-14) and kindred relations. Therefore, the
fourth order cumulants are the first to produce non-Gaussian behavior.
The next group to contribute are cumulants of sixth order.

We consider here only the nature of the fourth order cumulant. It
suffices to consider a four sample density ($N = 4$) in order to obtain
the general form of the relation between the cumulants and the moments.
We have from Equation (3.7-4)

\[ \kappa_{1111} = \mu_{1111} - \mu_{0011}\, \mu_{1100} - \mu_{0101}\, \mu_{1010} - \mu_{1001}\, \mu_{0110} \tag{3.7-6} \]
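Equation (3.7-6) can be estimated directly from sample averages. The sketch below is an illustration under assumptions that are not the report's scattering models: the "scattered" signal is a Gaussian signal multiplied by an independent Gaussian random gain, and the four samples are taken at coincident times so that the cumulant reduces to $\overline{y^4} - 3\,(\overline{y^2})^2$.

```python
import numpy as np

def fourth_cumulant(y1, y2, y3, y4):
    """Sample estimate of kappa_1111 for zero-mean samples, following Eq. (3.7-6)."""
    m = lambda a, b: np.mean(a * b)
    return (np.mean(y1 * y2 * y3 * y4)
            - m(y3, y4) * m(y1, y2)      # mu_0011 * mu_1100
            - m(y2, y4) * m(y1, y3)      # mu_0101 * mu_1010
            - m(y1, y4) * m(y2, y3))     # mu_1001 * mu_0110

rng = np.random.default_rng(1)
n = 400_000
x = rng.standard_normal(n)        # Gaussian target signal samples
gain = rng.standard_normal(n)     # assumed random scattering gain, independent of x
y = gain * x                      # toy "scattered" signal

k_gauss = fourth_cumulant(x, x, x, x)
k_scat = fourth_cumulant(y, y, y, y)
print("Gaussian input:  kappa_1111 ~", k_gauss)
print("scattered output: kappa_1111 ~", k_scat)
```

For the Gaussian input the estimate is close to zero, while for the randomly scaled signal it is clearly non-zero, which is the kind of departure from Gaussian statistics that the fourth order cumulant is meant to measure.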

Forcing the two channels in Figure 3.4-1 to be the same, particularizing
$H_f(\omega)$ to be 1, and taking the four samples at times $t$, $t'$, $t-\tau$, $t'-\tau'$
we have

\[ \begin{aligned}
\mu_{1111} &= \Pi(\tau,\tau',t,t') && \text{(a)} \\
\mu_{1100} &= \overline{\Xi}(t-t',T,p) && \text{(b)} \\
\mu_{0011} &= \overline{\Xi}(t-t'-\tau+\tau',T,p) && \text{(c)} \\
\mu_{1010} &= \overline{\Xi}(\tau,T,p) && \text{(d)} \\
\mu_{0101} &= \overline{\Xi}(\tau',T,p) && \text{(e)} \\
\mu_{1001} &= \overline{\Xi}(t-t'+\tau',T,p) && \text{(f)} \\
\mu_{0110} &= \overline{\Xi}(t'-t+\tau,T,p)^* && \text{(g)}
\end{aligned} \tag{3.7-7} \]

Substituting these into Equation (3.7-6) and using Equation (3.5-8) we
have

The integrals for $J_1$, $J_2$, $J_3$ and $K_a$ are evaluated for various models
for scattering in the remaining chapters of this report. In general, the
result for $\kappa_{1111}$ is found to be non-zero. The scattered signal
is therefore a non-Gaussian process.
3.8 The Two-Sided Likelihood Ratio Decision Scheme for the Correlator Detector

It is clear from the discussion of section 3.7 that since the
probability density for the scattered signal is generally non-Gaussian,
joint statistics for a two channel array or multipath configurations are
also non-Gaussian. Evidence presented in the remainder of this report
indicates the importance of taking into account this departure from
Gaussian statistics in describing the scattered signals. The design of
optimal detectors for such signals is not a trivial task, and we shall not
undertake this problem.

In lieu of a formal attack on the optimal detector design problem we
regard the detector structures of Figures 3.1-1 and 3.4-1 as essentially
fixed and inquire how best to decide whether a target is present from the
output $\Xi(\tau,T,p)$. We begin by examining the case for which the replication
delay parameter $\tau$ is fixed or "steered" to the correct value $\tau_0$
given that a target is present. We call this the "on-target" detection
problem since we effectively evaluate the ability of the detector to
function properly while it is "looking" directly at the target.

Given the correlator output $\Xi(\tau_0,T,p)$ one must decide between two
hypotheses:

$H_0$: The target is not present.
(The correlator output is due to noise only.)

$H_1$: The target is present.
(The correlator output contains a component
due to the presence of signal.)

We assume that the correlator integration interval $T$ is large enough^{68}
so that $\Xi(\tau_0,T,p)$ tends to be a Gaussian random variable under either
hypothesis. We then have the following probability densities for $\Xi$:

\[ H_0:\quad f_0(\Xi) = \frac{1}{\sqrt{2\pi}\, \sigma_0} \exp\{ -\tfrac{1}{2} (\Xi - \overline{\Xi}_0)^2 / \sigma_0^2 \} \tag{3.8-1} \]

\[ H_1:\quad f_1(\Xi) = \frac{1}{\sqrt{2\pi}\, \sigma_1} \exp\{ -\tfrac{1}{2} (\Xi - \overline{\Xi}_1)^2 / \sigma_1^2 \} \tag{3.8-2} \]

As a practical matter $\overline{\Xi}_0$ is either zero or approximately zero for
most problems considered, so we set it to zero here.

Now the likelihood ratio procedure for deciding between the two
alternative hypotheses $H_0$ and $H_1$ given a measured value for $\Xi$ is
to decide in favor of $H_0$ if

$$\frac{f_0(\Xi)}{f_1(\Xi)} > K_{th}$$   (3.8-3)

and in favor of $H_1$ if

$$\frac{f_0(\Xi)}{f_1(\Xi)} < K_{th}$$   (3.8-4)


The performance of such a decision scheme depends on the choice of the
threshold value $K_{th}$ and on the parameters $\sigma_0$, $\bar\Xi_1$, and
$\sigma_1$ which are determined by the statistics of the signals, noises
and scattering. These latter parameters may be computed from Equations
(3.1-9) and (3.3-8) for the multipath detector or (3.4-4) and (3.5-9) for
the two-element array detector.

The critical values of $\Xi$ for which equality holds in Equation
(3.8-3) are obtained by substituting Equations (3.8-1) and (3.8-2) and
taking the natural logarithm, yielding

$$\Xi^2\left(\frac{1}{\sigma_0^2}-\frac{1}{\sigma_1^2}\right) + \frac{2\,\Xi\,\bar\Xi_1}{\sigma_1^2} - \frac{\bar\Xi_1^2}{\sigma_1^2} = -2\ln\left(\frac{K_{th}\,\sigma_0}{\sigma_1}\right)$$   (3.8-5)


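The solution for the roots of this quadratic leads to the critical
values:

$$\Xi_\pm = \frac{\bar\Xi_1\,d_0^2}{d_0^2-d_1^2}\left[1 \mp \frac{d_1}{d_0}\sqrt{1+2\,(d_0^2-d_1^2)\ln\left(\frac{K_{th}\,d_0}{d_1}\right)}\,\right]$$   (3.8-6)

where

$$d_0 = \sigma_0/\bar\Xi_1, \qquad d_1 = \sigma_1/\bar\Xi_1$$   (3.8-7)

are the standard deviations of the correlator output under the two
hypotheses, normalized by the mean under $H_1$.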

In general the variance $\sigma_1^2$ is greater than $\sigma_0^2$ (see
Figure 3.8-1). The special case of $d_0 \to d_1$ in Equation (3.8-6), for
which one critical value tends to $\infty$, arises only in the case of very
weak signal-to-noise ratios. Therefore, the two critical values divide
the range $-\infty < \Xi < \infty$ into three decision intervals:

$-\infty < \Xi < \Xi_-$     Decide in favor of $H_1$.

$\Xi_- < \Xi < \Xi_+$     Decide in favor of $H_0$.     (3.8-8)

$\Xi_+ < \Xi < \infty$     Decide in favor of $H_1$.

[Figure 3.8-1]

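Given any threshold value $K_{th}$ and the parameters $d_0$ and $d_1$
one may evaluate the decision scheme Equation (3.8-8) by computing the
probability of incorrect decisions. The probability $\alpha$ of committing
a type I error (false alarm) is equal to the probability of $\Xi$
incorrectly exceeding one of the critical values $\Xi_\pm$:

$$\alpha = \int_{-\infty}^{\Xi_-} f_0(\Xi)\,d\Xi + \int_{\Xi_+}^{+\infty} f_0(\Xi)\,d\Xi = 1 - \tfrac12\left[\theta\left\{\frac{\Xi_+}{\sqrt2\,\sigma_0}\right\} - \theta\left\{\frac{\Xi_-}{\sqrt2\,\sigma_0}\right\}\right]$$

$$= 1 - \tfrac12\left[\theta\left\{\frac{d_0 - d_1\sqrt{1+2[d_0^2-d_1^2]\ln(K_{th}d_0/d_1)}}{\sqrt2\,[d_0^2-d_1^2]}\right\} - \theta\left\{\frac{d_0 + d_1\sqrt{1+2[d_0^2-d_1^2]\ln(K_{th}d_0/d_1)}}{\sqrt2\,[d_0^2-d_1^2]}\right\}\right]$$   (3.8-9)

where $\theta(z)$ is the error function:

$$\theta(z) = \frac{2}{\sqrt\pi}\int_0^z e^{-t^2}\,dt$$   (3.8-10)

Similarly, the probability $\beta$ of a type II error or false dismissal is
determined by the probability that $\Xi$ incorrectly lies within the
critical levels $\Xi_\pm$ when in fact a target is present:

$$\beta = \int_{\Xi_-}^{\Xi_+} f_1(\Xi)\,d\Xi = \tfrac12\left[\theta\left\{\frac{\Xi_+-\bar\Xi_1}{\sqrt2\,\sigma_1}\right\} - \theta\left\{\frac{\Xi_--\bar\Xi_1}{\sqrt2\,\sigma_1}\right\}\right]$$

$$= \tfrac12\left[\theta\left\{\frac{d_1 - d_0\sqrt{1+2[d_0^2-d_1^2]\ln(K_{th}d_0/d_1)}}{\sqrt2\,[d_0^2-d_1^2]}\right\} - \theta\left\{\frac{d_1 + d_0\sqrt{1+2[d_0^2-d_1^2]\ln(K_{th}d_0/d_1)}}{\sqrt2\,[d_0^2-d_1^2]}\right\}\right]$$   (3.8-11)
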

In general, the "on-target" normalized variances $d_0^2$ and $d_1^2$ are
overly pessimistic indicators of the performance of the correlator
detectors. This is true because the peak of measured correlation
$\Xi(\tau,T,\mu)$ usually does not occur precisely at the correct location
(see section 3.6). Consequently, a detector which initiates a search for
the peak in the vicinity of $\tau_o$ performs somewhat better. A procedure
for evaluating this "search" detector for certain classes of signals is
discussed in Appendix D.


Finally we note the importance of using the two-sided test Equation
(3.8-8) in cases of moderate or strong scattering. In such situations
$\bar\Xi_1$ may be quite small while $\sigma_1$ may be much greater than
$\sigma_0$, leading to sign changes in the peak of correlation (see Figure
3.8-1). Failure to use the two-sided test when this occurs can produce
sizable losses in detectability.
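
The two-sided test of this section can be sketched numerically. The following is a minimal illustration, not the report's own program: it assumes the noise-only mean is zero, that $d_1 > d_0$, and that the critical values are the two real roots of the logarithmic likelihood equation. The function names and the `mean1` argument are hypothetical.

```python
import math


def critical_values(d0, d1, k_th, mean1=1.0):
    """Roots of the log-likelihood quadratic, noise-only mean taken as zero.

    d0 and d1 are the standard deviations under H0 and H1 divided by the
    H1 mean `mean1` (assumed names). Returns (xi_minus, xi_plus).
    """
    if not d1 > d0 > 0:
        raise ValueError("this sketch assumes d1 > d0 > 0")
    diff = d0 ** 2 - d1 ** 2
    disc = 1.0 + 2.0 * diff * math.log(k_th * d0 / d1)
    if disc < 0:
        raise ValueError("threshold too large: no real critical values")
    root = (d1 / d0) * math.sqrt(disc)
    scale = mean1 * d0 ** 2 / diff
    a, b = scale * (1.0 - root), scale * (1.0 + root)
    return min(a, b), max(a, b)


def error_probabilities(d0, d1, k_th, mean1=1.0):
    """False-alarm (alpha) and false-dismissal (beta) probabilities."""
    lo, hi = critical_values(d0, d1, k_th, mean1)
    s0, s1 = d0 * mean1, d1 * mean1
    r2 = math.sqrt(2.0)
    # alpha: probability under H0 of falling outside (lo, hi)
    alpha = 1.0 - 0.5 * (math.erf(hi / (r2 * s0)) - math.erf(lo / (r2 * s0)))
    # beta: probability under H1 of falling inside (lo, hi)
    beta = 0.5 * (math.erf((hi - mean1) / (r2 * s1))
                  - math.erf((lo - mean1) / (r2 * s1)))
    return alpha, beta


if __name__ == "__main__":
    print(critical_values(0.5, 1.0, 1.0))
    print(error_probabilities(0.5, 1.0, 1.0))
```

Sweeping `k_th` for fixed `d0` and `d1` traces out a receiver operating characteristic of the kind discussed later for the multipath correlator.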
CHAPTER 4


THE RANDOM AMPLITUDE AND DELAY
MODEL FOR SCATTERING


4.0  Introduction 

In this chapter surface scattering is modeled as a time-varying
system with the transfer function

$$H(\omega,t) = A(t)\,e^{-j\omega\tau(t)}$$   (4.0-1)

where the amplitude $A(t)$ and delay $\tau(t)$ are considered independent,
stationary, Gaussian random variables. This simple model is used by
Price,$^{73}$ Green,$^{74}$ Turin,$^{75}$ and others in studies of
communication through ionospheric and tropospheric channels. It serves as
an example which produces certain effects on correlator outputs which are
typical of more complex and realistic models.

The mean and variance of the outputs from multipath and array
cross-correlators are computed using the formulas derived in Chapter
3. It is shown that as the correlator integration time is increased
from zero the output variance at first decreases rapidly. However,
at the point where reliable correlation begins to emerge in the
absence of scattering the variance passes through a "settling" period
during which averaging over slowly varying scattering fluctuations
takes place. The persistence of these fluctuations is computed and
related to the fourth-order cumulant of the scattered signal.

4.1 First and Second Order System Statistics for


the Random Amplitude and Delay Model

It is assumed that the amplitude and delay in (4.0-1) have
the mean values

a) $\overline{A(t)} = \dfrac{n_s}{r_o + r_o'} = A_c$     b) $\overline{\tau(t)} = \dfrac{r_o + r_o'}{c} = \tau_s$   (4.1-1)

where the distances $r_o$ and $r_o'$ are defined in Figure 1.4-1 and $n_s$
is a constant less than one. We define the correlation functions

a) $R_{AA}(\mu) = \overline{A(t)\,A(t-\mu)}$     b) $R_{\tau\tau}(\mu) = \overline{\tau(t)\,\tau(t-\mu)} - \tau_s^2$   (4.1-2)

where for convenience $R_{AA}$ is defined as a non-central moment.

Since $A(t)$ and $\tau(t)$ are independent

$$\overline{H(\omega,t)} = H_c(\omega) = \overline{A(t)}\;\overline{e^{-j\omega\tau(t)}} = A_c\,Q_\tau^{(1)}(\omega)$$   (4.1-3)

where $Q_\tau^{(1)}(\omega)$ is the one dimensional characteristic function
for $\tau(t)$. From the assumed stationarity of $\tau$, $H_c(\omega)$ is time
invariant.

In a similar manner,

$$\Phi(\omega,\omega',\mu) = \overline{H(\omega,t)\,H^*(\omega',t-\mu)} = \overline{A(t)\,A(t-\mu)}\;\overline{\exp\{-j\omega\tau(t)+j\omega'\tau(t-\mu)\}} = R_{AA}(\mu)\,Q_\tau^{(2)}(\omega,-\omega',\mu)$$   (4.1-4)

where $Q_\tau^{(2)}(\omega,\omega',\mu)$ is the characteristic function for
the two-dimensional probability distribution for $\tau$. Stationarity of
$\tau$ and $A$ therefore guarantees that the scattering system will be
interfrequency wide sense stationary (IWSS). However, in the case of
$\tau(t)$ the requirement is for stationarity of the two sample
distribution whereas $A(t)$ need only be wide sense stationary.
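
As a numerical illustration of the coherent transfer function, the sketch below compares a Monte Carlo average of $A(t)\,e^{-j\omega\tau(t)}$ with the closed form that follows from the characteristic function of a Gaussian delay. It assumes independent Gaussian amplitude and delay, as stated for this model; the function names and parameter values are hypothetical.

```python
import cmath
import random


def coherent_transfer_mc(omega, a_c, sigma_a, tau_s, sigma_tau, n=20000, seed=0):
    """Monte Carlo estimate of the mean of A * exp(-j*omega*tau)."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n):
        a = rng.gauss(a_c, sigma_a)        # amplitude sample
        tau = rng.gauss(tau_s, sigma_tau)  # delay sample, independent of a
        total += a * cmath.exp(-1j * omega * tau)
    return total / n


def coherent_transfer_closed_form(omega, a_c, tau_s, sigma_tau):
    """A_c times the characteristic function of a Gaussian delay."""
    return a_c * cmath.exp(-0.5 * (omega * sigma_tau) ** 2 - 1j * omega * tau_s)


if __name__ == "__main__":
    args = dict(omega=2.0, a_c=1.0, tau_s=0.5, sigma_tau=0.4)
    print(coherent_transfer_mc(sigma_a=0.3, **args))
    print(coherent_transfer_closed_form(**args))
```

The amplitude spread does not enter the closed form: only the mean amplitude and the delay statistics shape the coherent response.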

Although the instantaneous impulse response corresponding to
(4.0-1) is of infinitesimal duration,

$$h(\nu,t) = A(t)\,\delta(\nu-\tau(t))$$   (4.1-5)

the average or coherent response is of finite width:

$$h_c(\nu) = \int_\omega e^{j\omega\nu}\,H_c(\omega)\,\frac{d\omega}{2\pi} = A_c\,f_\tau(\nu)$$   (4.1-6)

where $f_\tau(\nu)$ is the probability density for $\tau$. Immediate
corollaries are obtained from (2.4-1) and (2.4-4):

a) $D_c(\mu,\nu) = A_c\,f_\tau(\mu-\nu)$     b) $R_{xy}(\mu) = A_c\int_\nu R_{xx}(\mu-\nu)\,f_\tau(\nu)\,d\nu$   (4.1-7)

Similarly,

$$\overline{h(\nu,t)\,h(\nu',t-\mu)} = \int_\omega\int_{\omega'} e^{j\omega\nu-j\omega'\nu'}\;\overline{H(\omega,t)\,H^*(\omega',t-\mu)}\;\frac{d\omega}{2\pi}\,\frac{d\omega'}{2\pi} = R_{AA}(\mu)\,f_\tau(\nu,\nu',\mu)$$   (4.1-8)

where $f_\tau(\nu,\nu',\mu)$ is the two sample probability density
corresponding to $Q_\tau^{(2)}(\omega,\omega',\mu)$. The auto decorrelation
function is also readily found from (2.4-23):

$$D_a(\mu,\nu) = R_{AA}(\mu)\int_\omega \overline{e^{j\omega[\tau(t)-\tau(t-\mu)]}}\;e^{j\omega(\mu+\nu)}\,\frac{d\omega}{2\pi} = R_{AA}(\mu)\,f_{\delta\tau}(\mu+\nu,\mu)$$   (4.1-9)

where $f_{\delta\tau}(\nu,\mu)$ is the probability density for the difference
$\nu$ between the values of $\tau$ at times $t$ and $t-\mu$. It follows from
(2.4-21) that

$$R_{yy}(\mu) = R_{AA}(\mu)\int_\nu R_{xx}(\mu-\nu)\,f_{\delta\tau}(\nu,\mu)\,d\nu$$   (4.1-10)


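For the special case of Gaussian $\tau$ we have

$$H_c(\omega) = A_c\exp\{-\tfrac12\omega^2\sigma_\tau^2 - j\omega\tau_s\}$$   (4.1-11)

$$\Phi(\omega,\omega',\mu) = R_{AA}(\mu)\exp\{-\tfrac12[(\omega^2+\omega'^2)\sigma_\tau^2 - 2\omega\omega' R_{\tau\tau}(\mu)] - j(\omega-\omega')\tau_s\}$$   (4.1-12)

$$D_c(\mu,\nu) = \frac{A_c}{\sqrt{2\pi}\,\sigma_\tau}\exp\{-\tfrac12(\mu-\nu-\tau_s)^2/\sigma_\tau^2\}$$   (4.1-13)

$$D_a(\mu,\nu) = R_{AA}(\mu)\,\frac{\exp\{-\tfrac12(\mu+\nu)^2/(2[\sigma_\tau^2-R_{\tau\tau}(\mu)])\}}{\sqrt{4\pi[\sigma_\tau^2-R_{\tau\tau}(\mu)]}}$$   (4.1-13)

For this case the trifrequency spectral density becomes

$$\frac{d\Lambda(\omega,\omega',\omega'')}{d\omega''} = \exp\{-\tfrac12(\omega^2+\omega'^2)\sigma_\tau^2 - j(\omega-\omega')\tau_s\}\sum_{n=0}^{\infty}\frac{(\omega\omega')^n}{n!}\,s_n(\omega'')$$   (4.1-14)

where

$$s_n(\omega'') = \int_\mu e^{-j\omega''\mu}\,R_{AA}(\mu)\,[R_{\tau\tau}(\mu)]^n\,d\mu$$   (4.1-15)

Similarly, applying (2.4-20) to (2.4-17) and using (4.1-14) we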
obtain the power spectrum smearing density:


dy 


/ 


w 


exp(jwv)  dA(a),q)ty) 

dy 


•o 


(4.1-16)


where  H^r\z)  Is  the  m-th  order  Hermite  polynomial.2^ 
In order to interpret (4.1-14) and (4.1-16) let

$$R_{AA}(\mu) = \sigma_A^2\exp(-\tfrac12\Omega_A^2\mu^2) + A_c^2$$   (4.1-17)

$$R_{\tau\tau}(\mu) = \sigma_\tau^2\exp(-\tfrac12\Omega_\tau^2\mu^2)$$   (4.1-18)

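From the definition of $s_n(\gamma)$ in (4.1-15) we have

$$s_n(\gamma) = \frac{\sqrt{2\pi}\,\sigma_\tau^{2n}}{\sqrt{n}\,\Omega_\tau}\left[\frac{\sigma_A^2\,\exp\{\tfrac12\gamma^2\Omega_A^2/[n\Omega_\tau^2(\Omega_A^2+n\Omega_\tau^2)]\}}{\sqrt{\Omega_A^2/(n\Omega_\tau^2)+1}} + A_c^2\right]\exp\{-\tfrac12\gamma^2/(n\Omega_\tau^2)\}$$   (4.1-19)
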
It is seen that the frequency smear for larger values of n is wider.

From this we can conclude that the higher order and more intricate
Hermite functions in (4.1-16) represent the components of scattering
which exhibit the greatest fading. From equation (4.1-14) it can
be seen that these components are excited by the higher frequency
components of the output. The wider frequency smear at higher frequencies
is characteristic of delay modulation.
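
The widening of the smear with n can be checked numerically. The sketch below evaluates $s_n(\gamma)$ by direct integration for the Gaussian-shaped correlation functions of (4.1-17) and (4.1-18) and compares it with a closed form obtained from standard Gaussian integrals. It assumes the transform in (4.1-15) is taken with kernel $e^{-j\gamma\mu}$ and no $1/2\pi$ factor, and $n \ge 1$; the function names and parameter values are hypothetical.

```python
import math


def s_n_numeric(gamma, n, sigma_a, a_c, omega_a, sigma_tau, omega_tau, steps=4000):
    """Trapezoidal estimate of the integral of cos(gamma*mu)*R_AA*R_tautau**n."""
    half_range = 10.0 / (math.sqrt(n) * omega_tau)  # integrand ~ e^-50 at the ends
    dmu = 2.0 * half_range / steps
    total = 0.0
    for k in range(steps + 1):
        mu = -half_range + k * dmu
        r_aa = sigma_a ** 2 * math.exp(-0.5 * (omega_a * mu) ** 2) + a_c ** 2
        r_tt = sigma_tau ** 2 * math.exp(-0.5 * (omega_tau * mu) ** 2)
        weight = 0.5 if k in (0, steps) else 1.0
        # the integrand is even in mu, so only the cosine part survives
        total += weight * math.cos(gamma * mu) * r_aa * r_tt ** n
    return total * dmu


def s_n_closed(gamma, n, sigma_a, a_c, omega_a, sigma_tau, omega_tau):
    """Closed form from Gaussian integrals, valid for n >= 1."""
    b2 = omega_a ** 2 + n * omega_tau ** 2
    c2 = n * omega_tau ** 2
    return sigma_tau ** (2 * n) * math.sqrt(2.0 * math.pi) * (
        sigma_a ** 2 / math.sqrt(b2) * math.exp(-0.5 * gamma ** 2 / b2)
        + a_c ** 2 / math.sqrt(c2) * math.exp(-0.5 * gamma ** 2 / c2)
    )


if __name__ == "__main__":
    params = dict(sigma_a=0.5, a_c=1.0, omega_a=1.0, sigma_tau=0.3, omega_tau=2.0)
    for n in (1, 2, 4):
        ratio = s_n_closed(4.0, n, **params) / s_n_closed(0.0, n, **params)
        print(n, ratio)  # relative level at gamma = 4 grows with n
```

The ratio printed for each n is the level at a fixed offset frequency relative to the peak, so a larger value means a wider smear.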
4.2 Multipath Correlator Fluctuations for


the Random Amplitude and Delay Model

Following section 3.8 we define normalized covariances

$$d_j^2(\tau,T,\mu) = \frac{\Lambda(\tau,\tau,T,\mu)\big|_{H_j}}{[\bar\Xi(\tau,T,\mu)]^2}, \qquad j = 0, 1$$   (4.2-1)

where $\Lambda(\tau,\tau,T,\mu)$ must be evaluated under hypothesis $H_0$
(noise only) and $H_1$ (signal plus noise present) while
$\bar\Xi(\tau,T,\mu)$ is understood to be computed for $H_1$. We compute
here the normalized covariances for the configuration shown in Figure
3.1-1 assuming for simplicity that the spectra for the signal and
background noise are given respectively by


a)  s]0t(u)  .  /2»Px  exp(V  /(l2) 

fl 


(4.2-2) 


and  2  2  2 

b)  9  .  .  p  -’*‘1  /fin,  p  -**>  '  02 

9  n,n,(<o)  _  nl  e  c)  S_  _  (w)_  n2  e 

11  .  2n2 


n 


n 


nl  n2 

Also  for  simplicity  we  assume  that  the  filters  H^(uj)  and  82(0))  are 


as  follows 


a)  Hjto)  -  e 


2  2 

/  fl. 


2  2 
/  flo 

b)  H„ (u)  ■  e  (4.2-3) 


Here $P_x$, $P_{n1}$ and $P_{n2}$ are the signal and noise powers
respectively. It is assumed that the bandwidths $\Omega_x$, $\Omega_1$,
$\Omega_2$, $\Omega_{n1}$ and $\Omega_{n2}$ are all very large compared
with the fading bandwidths of the functions $s_n(\gamma)$ in
(4.1-15) so that equation (2.5-5) may be used.
The direct path transfer function is written as

$$H_d(\omega) = A_d\,e^{-j\omega\tau_d}$$   (4.2-4)

where $\tau_d = R_d/c$ and $A_d = 1/R_d$. Using these definitions (and
assuming the statistics for $A(t)$ and $\tau(t)$ discussed in section 4.1)
the mean output of the correlator under hypothesis $H_1$ becomes


E(t,T,p) 


A.  A  P 
d  c  x 


(?) 


1  2 
l  m  o 


(4.2-5) 


v/here  t 


t  -  t.  and 
s  d 


n 


m 


o+l 

T 

0 


2  n 


2  n; 


(4.2-6) 


The mean exhibits a peak of magnitude $A_d A_c P_x\,\Omega_m/\Omega_x$ at
$\tau = \tau_o$ corresponding to the true target location.

Let us assume that $R_{AA}(\mu)$ and $R_{\tau\tau}(\mu)$ are as given in
(4.1-17) and (4.1-18). Then using the results of Appendix E.1 we have the
following result for the "on-target" normalized covariance for the
null hypothesis $H_0$:


do<vT-u)  ” 


,2A2A2  p„  Px  nnl  W-T 


,2,2,2 

T  AdAc 


x 

p+  T 

f  (X- 


n2 

2 


v+pj  exp{  2  [ninl  +  R2n2^  dv 


(4.2-7) 
Similarly, we have the "on-target" normalized covariance for the
hypothesis $H_1$:


VV1**1)  - 


fi2/Q2 
x  m 

2  2 
t^a:a 

a  < 

— 1 

►V 

3 

t-* 

/fi2n2  \ 

vP  1 

x 

l0l  j 

nl 

M+T 


j  iT+v-y]  +  j  [T-v+y) 

y-T  y 


1  2  2 

exp{  ^lnl^ 


raa(v) 


exp' 


n 


/ 1  2[a2  -  R  (v)] 

/-j  +  T  XT' 

/  127 
lx 


r 

• 

-1 

I  2 

~  2_ 

i  “ 

—V 

IT 

— 0  +  2[o  i  (v) ] 

2  T  tt' 

__  IX 

l 

( 

J 

t(rKH 

x  nl 


O  3 

-l/2v  njn2 


.  .2,  f!  ,  '1/?y2nL  (4.2-8) 

+  Ad(-ii  e 

'  fi  ' 


raa<u)  Ad 


fi2  /“ 4  -  R2  (v) 

X  m  TT 


exp  J  - 


2  2 


v  -  R 

—  tn  tt 

9 


(vj 


I  Ac  Ad 

n2/n2 

x  m 


raa(v? 


2 

a  m 


/ 


15,  4  -  R2  (v) 
m  tt 


-1 


dv 


The effective bandwidths $\Omega_{1x}$, $\Omega_{1n1}$, and $\Omega_{2n2}$
are defined in Appendix E.1 and

$$\sigma_m = 1/\Omega_m$$   (4.2-9)

We now restrict our attention to the special case of

We may easily compute the normalized covariances for various numerical
examples, but exact closed-form expressions are not readily obtained
for all of the integrals involved.

For example, Figure 4.2-1 displays $d_1^2(\tau_o,T,0)$ for
$\sigma_\tau = 0$ and for various values of $\sigma_A$ using typical values
for other relevant parameters. Figure 4.2-2 shows $d_1^2(\tau_o,T,0)$ with
$\sigma_A = 0.5$ while $\sigma_\tau$ is varied. In either example it is seen
that as the integration time T is increased from 0 the normalized variance
at first decreases somewhat. However, this initial progress is halted just
when reliable correlation begins to emerge for certain realizations of the
random scattering system. At this point large uncertainties exist in the
correlator output which reflect the variability of the scattering. These
uncertainties show no reduction with increasing T until T becomes
comparable with the time constants of the scattering mechanisms. Once T
is increased beyond this point the normalized variance again decreases
rapidly. It is during this period that the correlator averages over
different scattering ensemble members.

The covariance $d_1^2(\tau_o,T,\gamma)$ of the correlator output
$\Xi(\tau,T,\mu)$ at two different times $\mu$ and $\mu+\gamma$ is plotted
in Figure 4.2-3. It can be seen that the correlator output for two
different starting times $\mu$ and $\mu+\gamma$ rapidly becomes
decorrelated for small T. As T is increased to the point where
discernible correlation begins to emerge, the fluctuations of the output
become extremely persistent. A second sample of $\Xi$ initiated at
$\mu+\gamma$ will not exhibit independent fluctuations since the
scattering system has not changed substantially. Sample correlator
outputs in these two regions of behavior are shown in Figure 4.2-4. For
this range of T the fluctuation of the correlator output appears to be
localized in the vicinity of $\tau_o$. This may be verified by considering
$d_1^2(\tau,T,\gamma)$ for $\tau \ne \tau_o$ in Equation (4.2-8). For
much larger values of T the fluctuations will be still more persistent
but of smaller magnitude as more variations of the scattering are included
in the processing interval.
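
The plateau behaviour described above can be illustrated with a deliberately simplified toy model. It is not Equation (4.2-8): it assumes the normalized variance is the sum of a hypothetical noise term falling as $1/(BT)$ and an amplitude-fading term equal to $\sigma_A^2/A_c^2$ times the standard variance-reduction factor for averaging a stationary process with Gaussian-shaped correlation over an interval T. All names and parameter values are assumptions.

```python
import math


def averaging_factor(t_int, omega_a, steps=2000):
    """(2/T^2) * integral_0^T (T - u) * rho(u) du with
    rho(u) = exp(-0.5*(omega_a*u)**2), by the trapezoidal rule."""
    du = t_int / steps
    total = 0.0
    for k in range(steps + 1):
        u = k * du
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (t_int - u) * math.exp(-0.5 * (omega_a * u) ** 2)
    return 2.0 * total * du / t_int ** 2


def toy_normalized_variance(t_int, bandwidth, omega_a, sigma_a, a_c):
    """Toy stand-in for the on-target normalized variance (assumed form)."""
    noise_term = 1.0 / (bandwidth * t_int)  # hypothetical 1/(BT) law
    fading_term = (sigma_a / a_c) ** 2 * averaging_factor(t_int, omega_a)
    return noise_term + fading_term


if __name__ == "__main__":
    for t_int in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0, 1e3, 1e4):
        print(t_int, toy_normalized_variance(t_int, 1000.0, 0.01, 0.5, 1.0))
```

With these values the output falls quickly at first, stalls near $\sigma_A^2/A_c^2 = 0.25$ while T is short compared with the fading time constant, and resumes falling only once T exceeds it.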

A careful examination of the terms in Equation (4.2-8) shows that
the most important contribution to the output variance in the
region of persistent fluctuation arises from the two integrals
I^(x,x',v)  and  I^(x,x')  in  (3,3-8).  The  magnitude  of  this  plateau 
of  uncertainty  for  the  model  we  are  considering  is  approximately 


(4.2-11) 


This is also approximately true in cases of high to moderate signal
to noise ratios. In Figure 4.2-5 $d_1^2(\tau_o,T,0)$ is plotted for various
signal to noise ratios $(P_x/P_n)$. Only in the case of very weak ratios
does the plateau become masked by noise. In Figure 4.2-6
$d_0^2(\tau_o,T,0)$ is plotted for the same sequence of signal to noise
ratios. In roughly the same range of T that (4.2-11) is valid we have
approximately


d  (t  ,T,0)  -  1 
oo  r 


nx  \ 2 

\  '  a2*2 

d  c 


(4.2-12) 


Assuming that $\Xi$ tends to be Gaussian we may use these curves
for $d_0^2$ and $d_1^2$ to compute for a given T the receiver operating
characteristic (a plot of $\alpha$ vs. $\beta$ parameterized by $K_{th}$).
Figure 4.2-7 gives one such curve corresponding to the circled points in
Figures 4.2-5 and 4.2-6. Also given for comparison is the related curve for
the case of no scattering (obtained by setting
$\sigma_\tau = \sigma_A = 0$ and $A_c = 1$).

In addition, the corresponding curves for these two cases are plotted for
a one sided detector.

It is clear that with the choice of parameter values selected
for these curves the scattering significantly increases the false
dismissal probability $\beta$ for any fixed false alarm probability
$\alpha$.

It can also be seen that under the assumption that $\Xi$ is Gaussian the
one sided detector produces still larger values of $\beta$. Furthermore,
these latter increases result in a receiver operating characteristic
which is not everywhere convex upward. This is a clear manifestation
that the one sided test departs markedly from the likelihood ratio
strategy in such situations. However, it is important to stress
that this result is due to the assumption of symmetry in $f_0(\Xi)$ and
$f_1(\Xi)$ under the hypothesis that these functions are Gaussian. Given
that $A(t)$ is itself Gaussian this hypothesis is not unreasonable.
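
The comparison between the two-sided and one-sided detectors can be sketched as follows, under the same Gaussian assumption. For each threshold the code computes $\alpha$ and $\beta$ for the two-sided likelihood ratio test, then the $\beta$ of a single-threshold test set to the same $\alpha$. The H1 mean is normalized to one and the noise-only mean to zero; the function names and the values `d0 = 0.5`, `d1 = 3.0` are hypothetical and chosen only to represent strong scattering.

```python
import math
from statistics import NormalDist


def two_sided_errors(d0, d1, k_th):
    """(alpha, beta) of the two-sided test; assumes d1 > d0 > 0."""
    diff = d0 ** 2 - d1 ** 2
    root = (d1 / d0) * math.sqrt(1.0 + 2.0 * diff * math.log(k_th * d0 / d1))
    lo, hi = sorted((d0 ** 2 / diff * (1.0 - root),
                     d0 ** 2 / diff * (1.0 + root)))
    h0, h1 = NormalDist(0.0, d0), NormalDist(1.0, d1)
    alpha = 1.0 - (h0.cdf(hi) - h0.cdf(lo))  # outside (lo, hi) under H0
    beta = h1.cdf(hi) - h1.cdf(lo)           # inside (lo, hi) under H1
    return alpha, beta


def one_sided_beta(d0, d1, alpha):
    """False-dismissal probability of a single upper threshold at equal alpha."""
    eta = NormalDist(0.0, d0).inv_cdf(1.0 - alpha)
    return NormalDist(1.0, d1).cdf(eta)


if __name__ == "__main__":
    for k_th in (0.5, 1.0, 2.0):
        a, b2 = two_sided_errors(0.5, 3.0, k_th)
        b1 = one_sided_beta(0.5, 3.0, a)
        print(f"K_th={k_th}: alpha={a:.4f} two-sided beta={b2:.4f} "
              f"one-sided beta={b1:.4f}")
```

Because the two-sided rule is the likelihood ratio test for these densities, its false-dismissal probability cannot exceed that of the one-sided rule at the same false-alarm probability.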

We turn briefly to the question of parameter optimization in
the adopted receiver design. As might be expected, the rather
simplistic model of (4.0-1) presents us with few alternatives. For
example, it is clear that there is nothing one can do to filter out
the effect of the slow amplitude variations $A(t)$. This might not be
the case were we to assume $A(t)$ to be frequency dependent, but we
choose to ignore such effects in this chapter. On the other hand,
assuming that our purpose is to improve "on-target" detectability,
it is clear that the plateau of normalized variance in (4.2-11) can
be reduced by decreasing the working bandwidth until
$\sigma_m \gg \sigma_\tau$.

This strategy effectively screens out the incoherent delay modulated
signal fluctuations. The receiver operating characteristics of
Figure 4.2-8 demonstrate, however, that if $\Omega_m$ is made too small,
performance can actually worsen. Figure 4.2-9 illustrates how in
practice the plateau of Equation (4.2-11) can become masked by the
other terms in Equation (4.2-8) before the theoretical minimum of
$\sigma_A^2/A_c^2$ can be achieved for $d_1^2$.

Within the limitation just described it is seen that the strategy
of screening out incoherent signal definitely leads to an improvement
of detectability when the receiver is constrained to be steered
on target. When this constraint is removed (see Appendix D) a certain
information content is found to exist in the incoherent scattered
signal which is due to the non-Gaussian nature of this signal. The
unconstrained receiver improves performance by attempting to estimate
the location of the delay modulated peak of correlation. The magnitude
of this peak is always greater than or equal to the magnitude
of $\Xi$ at $\tau_o$.

Although some of the incoherent scattered signal contains
information useful for detection purposes, clearly none of it is useful
for tracking in this case. The covariance $R_{\tau_o}(\gamma)$ of the peak
location estimate is obviously a suitable, straightforward measure of
the magnitude and persistence of tracking errors. By applying the
results of Appendix E.1 to Equation (3.6-3) we have

y  y+T 


-  ,  .  ,fl3  2 

rt  (u)  .  (  x/  m) 

0  2  2  2 
t  a:az 
d  c 
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(A. 2-13) 
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]  (-1) 


Ad  raa(v) 


nuU  -  R2  (v))3 

X  m  TT 
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TT(V>  + 


exp(-[a2  -  R^Cv)]'1  v2)<Rtt<v)-  v2[o2  +  R„(v))2)  J  ) 


As can be seen from Equation (4.2-13) the covariance of the tracking
error exhibits a plateau of uncertainty of approximately

(4.2-14)

which is also reduced by increasing $\sigma_m$ until it is much greater
than $\sigma_\tau$, yielding a minimum of
$(1 + \sigma_A^2/A_c^2)\,\sigma_\tau^2$.

4.3 Second and Fourth Order Cross System Statistics


for the Two Channel Random Amplitude and Delay Model


In this section we consider some of the joint statistics for
the pair of channels described by the random amplitude and delay
system functions:

$$H_1(\omega,t) = A_1(t)\,e^{-j\omega\tau_1(t)}\,; \qquad H_2(\omega,t) = A_2(t)\,e^{-j\omega\tau_2(t)}$$   (4.3-1)

where $A_1(t)$ and $\tau_1(t)$ are considered jointly independent, as are
$A_2(t)$ and $\tau_2(t)$. However, $A_1(t)$ and $A_2(t)$ are assumed jointly
dependent, as are $\tau_1(t)$ and $\tau_2(t)$. All four variables are taken
to be jointly Gaussian with the means

$$\overline{A_1(t)} = \overline{A_2(t)} = A_c\,; \qquad \overline{\tau_1(t)} = \tau_{s1}\,; \qquad \overline{\tau_2(t)} = \tau_{s2}$$   (4.3-2)

We again define correlation functions for these parameters:

$$R_{Aa}(\mu) = \overline{A_1(t)\,A_1(t-\mu)} = \overline{A_2(t)\,A_2(t-\mu)}$$   (4.3-3)

$$R_{\tau a}(\mu) = \overline{\tau_1(t)\,\tau_1(t-\mu)} - \tau_{s1}^2 = \overline{\tau_2(t)\,\tau_2(t-\mu)} - \tau_{s2}^2$$   (4.3-4)

$$R_{Ac}(\mu) = \overline{A_1(t)\,A_2(t-\mu)}$$   (4.3-5)

$$R_{\tau c}(\mu) = \overline{\tau_1(t)\,\tau_2(t-\mu)} - \tau_{s1}\tau_{s2}$$   (4.3-6)

Once again $R_{Aa}$ and $R_{Ac}$ are defined as non-central moments for
convenience. Assuming stationarity for these parameters we have the
following system correlation functions:

$$\Phi_{jj}(\omega,\omega',\mu) = R_{Aa}(\mu)\exp\{-\tfrac12[(\omega^2+\omega'^2)\sigma_{\tau a}^2 - 2\omega\omega' R_{\tau a}(\mu)] - j(\omega-\omega')\tau_{sj}\}$$   (4.3-7)

for j = 1 and 2, and

$$\Phi_{12}(\omega,\omega',\mu) = R_{Ac}(\mu)\exp\{-\tfrac12[(\omega^2+\omega'^2)\sigma_{\tau a}^2 - 2\omega\omega' R_{\tau c}(\mu)] - j(\omega\tau_{s1}-\omega'\tau_{s2})\}$$   (4.3-8)

where $\sigma_{\tau a}^2 = R_{\tau a}(0)$. Obviously, the second order
stationarity of relations (4.3-3) through (4.3-6) results in
interfrequency wide sense cross stationary channels (IWSCS). This taken
together with the assumption that the amplitudes and delays are Gaussian
processes implies that the two channels are also jointly cross fourth
order interfrequency stationary (CFOIS). Thus,


[A] 


[4]  (u),w' ’  ,y,y,V')  *  R:  :  (£)exp{-l/2w  R  (jOco-  to  is  ) 
•  1,2  A1A  2  “1*2 


(4.3-9) 
As special cases we have

(“»“»“*  »«*  »V,T,V+T*)  -  (v,T,V+Tf)  X 

2  12  2  (4.3-15) 

exp{-l/2[a11(v,TfT,)w  +2a12(v,T,TMu>u>,+a13(v,T>T,)u’  )} 

*1^2  »"“»“*  ,v,t,v+ti)  -  (v,t,v+t’)  x 

expf-l^a^v.T.T'^^a^^T.T^dXD'+a^v,!,!’)^2]}  x  (4.3-16) 

exp{-j(w+u)')  (tgl-T  s2)} 

[4]  [AJ 

4>  0  (u,a)'  ^'w.v.t.v+t’)  ■  R  (v,T,v+Tf)  x 

A»z  A  A 
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exp{-l/2[a31(vfT,T,)aj2»-2a32(v,T,T,)wa3,+a33(v,T,T,)(D  2]}  x  (4.3-17) 

exp{-j(w-w’)(Tgl-TB2)} 


where

$a_{11}(\nu,\tau,\tau') = 2[R_{\tau a}(0) - R_{\tau a}(\nu)]$   (4.3-18)

$2a_{12}(\nu,\tau,\tau') = 2[R_{\tau c}(\tau) + R_{\tau c}(\tau')] - 2[R_{\tau c}(\tau-\nu) + R_{\tau c}(\nu+\tau')]$   (4.3-19)

$a_{13}(\nu,\tau,\tau') = 2[R_{\tau a}(0) - R_{\tau a}(\tau-\nu-\tau')]$   (4.3-20)

$a_{21}(\nu,\tau,\tau') = 2[R_{\tau a}(0) - R_{\tau c}(\tau)]$   (4.3-21)

$2a_{22}(\nu,\tau,\tau') = 2[R_{\tau a}(\nu) + R_{\tau a}(\tau-\nu-\tau')] - 2[R_{\tau c}(\nu+\tau') + R_{\tau c}(\tau-\nu)]$   (4.3-22)

$a_{23}(\nu,\tau,\tau') = 2[R_{\tau a}(0) - R_{\tau c}(\tau')]$   (4.3-23)

$a_{31}(\nu,\tau,\tau') = 2[R_{\tau a}(0) - R_{\tau c}(\nu+\tau')]$   (4.3-24)

$2a_{32}(\nu,\tau,\tau') = -2[R_{\tau a}(\tau-\nu-\tau')] + 2[R_{\tau c}(\tau) + R_{\tau c}(\tau')]$   (4.3-25)

$a_{33}(\nu,\tau,\tau') = 2[R_{\tau a}(0) - R_{\tau c}(\tau-\nu)]$   (4.3-26)


These  latter  parameters  satisfy  the  symmetry  relations 


°21(w»to*to)  ’  °23<v*to*To) 


(4.3-27) 

(4.3-28) 


Finally, under the Gaussian hypothesis for $A_1$ and $A_2$ we have


(V,T,V+T»)  = 

A1A2 


[Ra  (t)  Ra  (t')  +  Ra  (t)  Ra  (v+t'-t)  +  R  (v4t«)  RAc(v-t)]  . 
c  c  a  a 
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(4.3-29) 
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4.4  Array  Correlator  Fluctuations  for  the 

Two  Channel  Random  Amplitude  and  Delay  Model 


Once again we compute the normalized output variances as in
section 4.2. We consider the spectra (4.2-2) and filters (4.2-3)
subject to the symmetry assumptions (4.2-10) and introduce the noise
cross spectral density
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The  mean  output  of  the  correlator  is  then 
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Assuming  that  T0fi£nc  >>  1  this  mean  exhibits  a  peak  at  of 
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Thus,  to  this  level  of  approximation  using  the  results  of  Appendix  E.2 
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As  in  the  case  of  the  multipath  processor  the  level  of  uncertainty 

decreases as the bandwidth of the processed signal decreases. Whereas
in the case of the multipath processor this improvement ceased when
the working bandwidth decreased below a critical frequency of
$1/\sigma_\tau$, the critical frequency for (4.4-5) is roughly
$1/[2\{\sigma_{\tau a}^2 - R_{\tau c}(\tau_o)\}]^{1/2}$.
For incoherent channels ($R_{\tau c}(\tau_o) = 0$) the critical frequency
is about the same in both cases. On the other hand, for perfectly
correlated channels ($R_{\tau c}(\tau_o) = \sigma_{\tau a}^2$) the critical
frequency becomes infinite. For the model considered this behavior might
therefore depend on the steering delay, too. Only amplitude fluctuations
contribute to the level of the plateau of uncertainty when operating
below the critical frequency. However, the other terms of (4.4-4)
generally increase with decreasing frequency so that the higher critical
frequency available for partially correlated channels makes array
processing attractive.


Similar  statements  apply  to  the  covariance  for  the  tracking  error. 
In  order  to  shorten  the  expressions  that  arise  we  write 
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(4.4-6) 


and we contract the notation $a_{jk}(\nu,\tau_o,\tau_o)$ to $a_{jk}$.
Ignoring the derivatives of very slowly varying terms we obtain
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The  uncertainty  plateau  in  this  case  becomes 


r1*!  <v°-v 


RAc  (to) 


[  1  +  2{oxa  -  Rtc(t0))  ^fxl 


[  1  +  4{o2  -  R  (t  )}  ft2  ]3/2 
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In  this  case,  however,  not  only  does  the  critical  frequency 
rise with increasing coherence between the two channels, but the
magnitude of the plateau below the critical frequency decreases.
In the limit that $R_{\tau c}(\tau_o) = \sigma_{\tau a}^2$ it is clear that
the amplitude fluctuations have no effect on the plateau which, in fact,
disappears. The other terms in (4.4-7) do still contribute uncertainty for
small T and are affected by amplitude fluctuations unless the background
noise terms dominate.

Some of the effects discussed in this chapter are peculiar
to the model of frequency insensitive amplitude and delay fluctuations.
In particular, the behavior of the various plateaus of uncertainty
as a function of frequency is very simple. However, the root cause
for the existence of these persistent levels of uncertainty
is the non-Gaussian nature of the scattered signals, and this is not a
unique property of this simple example. While it is generally
necessary to compute all of the cumulants for a non-Gaussian density
in order to specify it, some insight is gained by examining only the
fourth order cumulant $\kappa(\nu,\tau,\tau')$ which can easily be computed
from (3.7-8) and the moments computed in Appendix D. The result for
some typical values of the relevant parameters is shown in Figure
4.4-1. The cumulant exhibits a plateau of dependence over a span
of time identical to the settling time of the various correlators.

[Figure: Log $\kappa(\tau,\tau,\nu)$ vs. Log $\nu$ for various ...]
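
A simple way to see why a Gaussian signal passed through a random Gaussian amplitude is no longer Gaussian is to look at a single-sample fourth cumulant. The sketch below is an illustration only and does not reproduce the lagged cumulant of the figure: it assumes a zero-mean unit-variance Gaussian signal sample multiplied by an independent Gaussian amplitude, for which the fourth cumulant equals three times the variance of the squared amplitude. The function names are hypothetical.

```python
import random


def fourth_cumulant_sample(a_c, sigma_a, n=200_000, seed=1):
    """Sample estimate of E[y^4] - 3*E[y^2]^2 for y = A*x (zero mean)."""
    rng = random.Random(seed)
    m2 = 0.0
    m4 = 0.0
    for _ in range(n):
        y = rng.gauss(a_c, sigma_a) * rng.gauss(0.0, 1.0)
        y2 = y * y
        m2 += y2
        m4 += y2 * y2
    m2 /= n
    m4 /= n
    return m4 - 3.0 * m2 * m2


def fourth_cumulant_exact(a_c, sigma_a):
    """3 * Var(A^2) for Gaussian A with mean a_c and deviation sigma_a."""
    return 3.0 * (4.0 * a_c ** 2 * sigma_a ** 2 + 2.0 * sigma_a ** 4)


if __name__ == "__main__":
    print(fourth_cumulant_exact(1.0, 0.5))
    print(fourth_cumulant_sample(1.0, 0.5))
```

The cumulant vanishes only when the amplitude does not fluctuate, which is the no-scattering limit in which the received signal stays Gaussian.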


CHAPTER 5


THE RANDOMIZED SINUSOIDAL
SURFACE MODEL


5.0 Introduction

Having obtained variance expressions for evaluating the
performance of various detectors and trackers using the random
amplitude and delay model we now consider the problem of scattering
from a sinusoidal boundary. In this case a receiver is excited
by reradiation from an illuminated area on a surface of the
form

$$\zeta(x,y,t) = h(t)\,\sin[q_s x\cos\alpha_s + q_s y\sin\alpha_s - \Omega_s t - \chi(t)]$$   (5.0-1)

where $h(t)$ and $\chi(t)$ are random waveheight and positional phase
parameters and are considered to be very slowly varying. The parameters
$q_s$, $\alpha_s$, and $\Omega_s$ are the magnitude and orientation of the
propagation vector and temporal frequency for the surface respectively.
This model was investigated by Gulin$^{74}$ for fixed $h$ and $\chi$.

The primary advantage of this model is that with certain
simplifying assumptions the space integrals of equation (1.4-3)
can be performed yielding a reliable expression for $H(\omega,t)$ for low
to moderate values of the Rayleigh parameter (1.0-2). This means
that the results are most relevant for the frequency range of passive
detection. The various integrals over frequency required for the
computation of various moments of the received signals even with
this simplification are still very difficult. Results are presented
here in the form of series which converge with reasonable speed.

5.1 Gulin's Solution for $H(\omega,t)$ and the
Associated Impulse Response

By substituting the equation for the boundary $\zeta(x,y,t)$ into
(1.4-13) Gulin obtained the following integral for $H(\omega,t)$:


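$$H(\omega,t) = -\frac{j\omega\sin\psi}{2\pi c\, r_o r_o'}\; e^{-j\omega(r_o+r_o')/c} \sum_{n=-\infty}^{\infty} J_n\!\left(\frac{2\omega\, h(t-r_o/c)\sin\psi}{c}\right) \exp[+jn\Omega_s(t-r_o/c) + jn\chi(t-r_o/c)]$$

$$\times \iint B_e(\theta,\phi)\, \exp\left\{-\frac{j\omega}{cR_e}\left(x^2\sin^2\psi + y^2\right) - jnq_s(x\cos\alpha_s + y\sin\alpha_s)\right\} dx\,dy \qquad (5.1-1)$$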


The space integral is generally not executable unless simplifying
assumptions are made about the beam pattern. Consider the case of
an omnidirectional receiver for which $B_e(\theta,\phi) = 1$. The conditions
necessary for the expansions of Appendix C in this limit are only
marginally satisfied and succeed primarily because of the localized
nature of the active scattering region as stated in section 1.4.
However, this approximation makes it possible to perform the integral
in closed form so that we have


$$H(\omega,t) = -\frac{e^{-j\omega T_s}}{r_o + r_o'} \sum_{n=-\infty}^{\infty} J_n[A(t)\omega]\; e^{+jBn^2/\omega}\, \exp[+jn\Omega_s(t - r_o/c) + jn\chi(t - r_o/c)] \qquad (5.1-2)$$

where

$$A(t) = \frac{2h(t')\sin\psi}{c}, \qquad t' = t - r_o/c \qquad (5.1-3)$$

$$B = \frac{q_s^2\, c\, R_e}{4}\left[\frac{\cos^2\alpha_s}{\sin^2\psi} + \sin^2\alpha_s\right] \qquad (5.1-4)$$

and

$$T_s = \frac{r_o + r_o'}{c} \qquad (5.1-5)$$

If the waveheight $h(t)$ and phase $\chi(t)$ are slowly varying then
it can be seen from (5.1-2) that the time variation of the frequency
response contains major sidebands at frequencies $n\Omega_s$. Cassedy
(ref. 50) recognized that each sideband corresponds to a single Bragg
order of scattering. In practice, at any given frequency the beamwidth
limits the number of excited orders that are received. Furthermore,
higher orders are excited at higher frequencies so that for any
fixed bandwidth only the first few orders need be retained in (5.1-6).
For the applications of this chapter we assume that the model will
be bandwidth rather than beamwidth limited.
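
The sideband structure of (5.1-2) can be sketched numerically as follows. This is a minimal illustration with $h$ and $\chi$ frozen, not the computation used in this report: the function names, the truncation order, and the evaluation of $J_n$ from its integral representation are all assumptions of the example.

```python
import cmath
import math

def bessel_j(n, x, steps=400):
    """Integer-order J_n(x) from (1/pi) * int_0^pi cos(n*th - x*sin(th)) dth."""
    d = math.pi / steps
    acc = 0.0
    for k in range(steps):
        theta = (k + 0.5) * d          # midpoint rule
        acc += math.cos(n * theta - x * math.sin(theta))
    return acc * d / math.pi

def sideband_amplitude(n, omega, A, B):
    """Complex weight of the n-th sideband in (5.1-2) for frozen h and chi."""
    return bessel_j(n, A * omega) * cmath.exp(1j * B * n * n / omega)

def transfer_function(omega, t, A, B, T_s, r_sum, omega_s, chi=0.0,
                      delay=0.0, max_order=12):
    """Truncated series (5.1-2); r_sum = r_o + r_o', delay = r_o / c."""
    total = 0j
    for n in range(-max_order, max_order + 1):
        total += (sideband_amplitude(n, omega, A, B)
                  * cmath.exp(1j * n * (omega_s * (t - delay) + chi)))
    return -cmath.exp(-1j * omega * T_s) / r_sum * total
```

With $B = 0$ the sum collapses to $\exp(jA\omega\sin\varphi)$, with $\varphi = \Omega_s(t - r_o/c) + \chi$, by the Jacobi-Anger identity. This gives a convenient check on the truncation order chosen for a given value of $A\omega$.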

Additional insight is gained concerning this model if we
examine the impulse response corresponding to (5.1-6). Unfortunately,
the first power dependence on frequency in the exponent again
prevents us from obtaining simple closed form expressions. However,
$h(\tau,t)$ can usefully be regarded as a sum of convolutions as follows:

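$$h(\tau,t) = -\frac{1}{r_o + r_o'} \sum_{n=-\infty}^{\infty} \iota_n \left[\int_{-\infty}^{\infty} f_1^n(\tau-\rho)\, f_2^n(\rho)\, d\rho\right] \exp\{+jn\Omega_s(t - r_o/c) + jn\chi(t - r_o/c)\} \qquad (5.1-6)$$

where $\iota_n$ is 1 for $n$ even and $j$ for $n$ odd and where from Watson
(p. 405, sec. 13.42):

$$f_1^n(t) = \frac{1}{\iota_n}\int_{-\infty}^{\infty} e^{j\omega(t-T_s)}\, J_n[A(t)\omega]\, \frac{d\omega}{2\pi}$$

$$= \begin{cases} \dfrac{\cos\{n\sin^{-1}[(t-T_s)/A(t)]\}}{\pi\sqrt{A^2(t) - (t-T_s)^2}}, & |t-T_s| < A(t) \\[2ex] 0, & |t-T_s| > A(t) \end{cases} \qquad n \text{ even}$$

$$= \begin{cases} \dfrac{\sin\{n\sin^{-1}[(t-T_s)/A(t)]\}}{\pi\sqrt{A^2(t) - (t-T_s)^2}}, & |t-T_s| < A(t) \\[2ex] 0, & |t-T_s| > A(t) \end{cases} \qquad n \text{ odd} \qquad (5.1-7)$$

and from Erdelyi (Vol. 1, p. 244, #31)

$$f_2^n(t) = \int_{-\infty}^{\infty} e^{j\omega t + jBn^2/\omega}\, \frac{d\omega}{2\pi} = \begin{cases} \delta(t) - n\sqrt{B/t}\; J_1(2n\sqrt{Bt}), & t \ge 0 \\[1ex] 0, & t < 0 \end{cases} \qquad (5.1-8)$$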

These two functions are sketched in figures (5.1-1a,b). The width
of $f_1^n(t)$ is $2A(t)$ or $4h(t')\sin\psi/c$. This is the difference in
travel time between rays reflected from imaginary planes tangent to
the upper and lower peaks of the sinusoidal boundary. The singularities
at the extremities of this interval suggest the importance
of the points of inflection of the surface. Due to the $\delta$-function
in (5.1-8) these singularities appear as the result of the convolutions
in (5.1-6). The Bessel function in (5.1-8) becomes more
oscillatory with increasing $n$ and $R_e$. In this limit the curvature
deviation of the incoming wavefront from planar is negligible over the
active scattering region. For large $B/A$ its effect is to introduce
very high frequency oscillations which may actually lie outside the
processing bandwidth, in which case they may be ignored. Figure
(5.1-2) illustrates this effect.
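
The delay-spread kernel $f_1^n$ can be checked numerically by transforming it back to the frequency domain, where it should reproduce $J_n(A\omega)$. The sketch below assumes the kernel is normalized with a factor $1/\pi$, which follows from taking the transform with the measure $d\omega/2\pi$. The function names and the quadrature scheme are illustrative only.

```python
import math

def bessel_j(n, x, steps=400):
    """Integer-order J_n(x) from its integral representation (midpoint rule)."""
    d = math.pi / steps
    return sum(math.cos(n * (k + 0.5) * d - x * math.sin((k + 0.5) * d))
               for k in range(steps)) * d / math.pi

def f1_kernel(n, x, A):
    """Kernel of order n at offset x = t - T_s; zero outside |x| < A."""
    if abs(x) >= A:
        return 0.0
    theta = math.asin(x / A)
    shape = math.cos(n * theta) if n % 2 == 0 else math.sin(n * theta)
    return shape / (math.pi * math.sqrt(A * A - x * x))

def bessel_from_kernel(n, omega, A, steps=2000):
    """Recover J_n(A*omega) by integrating the kernel against cos or sin.

    The substitution x = A*sin(theta) removes the endpoint singularities."""
    d = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -0.5 * math.pi + (k + 0.5) * d
        x = A * math.sin(theta)
        weight = f1_kernel(n, x, A) * A * math.cos(theta)  # f1(x) dx
        osc = math.cos(omega * x) if n % 2 == 0 else math.sin(omega * x)
        total += weight * osc * d
    return total
```

The kernel is supported on an interval of width $2A$, in agreement with the travel-time interpretation given above, and its integrable singularities sit at the two ends of that interval.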
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Fig.  5.1-lb 
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5.2  First  and  Second  Order  Statistics 
For  the  Sinusoidal  Boundary  Model 

We assume that the phase $\chi(t)$ is a sum of two independent
random variables:

$$\chi(t) = \chi_0 + \chi_1(t) \qquad (5.2-1)$$

where $\chi_0$ is a random initial phase which is uniformly distributed
over $[-\pi,\pi]$ and $\chi_1(t)$ is a stationary, zero mean Gaussian process.
The waveheight $h(t)$ is also taken to be a stationary, zero mean
Gaussian process so that the coherent frequency response becomes
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The  phase  x(t)  and  waveheight  h(t)  are  assumed  independent.  Since 
computationally this result implicitly contains series
representations for the exponential and Bessel functions, it is
worthwhile to consider an alternative scheme for performing the
integral in (5.2-2). By formally writing down the series for
an exponential and a Bessel function we have, for $\tfrac12 p - 1 \ge 0$,

J  *5p-x(*)  "  exp(-z2/2p)  x 


(5.2-3) 


n 


r(^P) 
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This  series  converges  very  rapidly  for  values  of  z  up  to  about 
2 or 3, which is sufficient for applications involving weak or
moderate scattering. In fact, for some applications one may
obtain qualitatively accurate results for small $z$ by stopping
with the first term. On rearranging the series we obtain


and where

$$(a)_b = \Gamma(a+b)/\Gamma(a) \qquad (5.2-6)$$

In particular, for $J_0(z)$ we have $p = 2$. Hence, (5.2-2) becomes

o  • 


-w 


H  («)« 


-  e 


(r  +  r') 
o  o 


/  1  +  2w2ob  Sin2\J> 


(5.2-7) 


]  *  (2m)  l  A(m 

^  2in+l  .  t  v  2 
m»o  (ml) 


2u2o2  Sin2^  ^  m 
-  J 


2  2  2  - 
T  1  +  2u>  0*  Sin  p 

L  c2 


m 


This result converges very rapidly for small values of the Rayleigh
parameter $k\sigma_h\sin\psi$, while for large values the frequency behavior
is still primarily determined by the first term. Furthermore, when
written in this form the result is easier to integrate in various
applications. For example, the coherent impulse response corresponding
to (5.2-7) is given by
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where  again  the  first  term  is  the  most  important  one  for  low 
frequency applications.

The same philosophy must be applied in obtaining the second
order statistics since without this technique it is not possible
to perform the appropriate averages. Hence we have

where the stationarity of $h(t)$ and $\chi_1(t)$ has been employed and where
$Q_{\chi_1}(n,n',\tau)$ is the characteristic function for the random variables
$\chi_1(t)$ and $\chi_1(t-\tau)$. It should be emphasized again that only the
terms for low values of $l$ and $m$ need be retained for low frequency
applications. The $\varepsilon_n$ are 2 for $n \ne 0$ and 1 for $n = 0$.

The remaining average over $h(t)$ and $h(t-\tau)$ becomes

where $\sigma_h^2$ and $\Gamma_h(\tau)$ are the variance and correlation function
respectively for $h(t)$. This average can be transformed as follows:
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the  modified  Gaussian  distribution  shown  and  where 
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(5.2-13) 
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Thus,  (5.2-9)  becomes 
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It  should  be  noted  that  both  D'  and  y!  .  (n)  are  functions  of 

n,\,m  j,K 

$\tau$. The most natural scheme for truncating these series is to retain
terms only up to a fixed value for $2(n+l+m)$ since this sets the order
of the approximation in terms of frequency times waveheight.

Writing down all moments up to order 4 we have

It can be seen that all of the moments $\mu_{n+2l,n+2m}(\tau)$ can be
written in the following form:

where the $P_{n,l,m}(\omega,\omega',\tau)$ are polynomials in $\omega$ and $\omega'$ of order
$2(n+l+m)$ in frequency and containing only even powers of $\omega$ and $\omega'$.
We write them in terms of frequency normalized by the Rayleigh
parameter:

$$P_{n,l,m}(\omega,\omega',\tau) = \sum_{p,q=0}^{n+l+m} G_{n,l,m}(p,q,\tau) \left(\frac{2\omega^2\sigma_h^2\sin^2\psi}{(n+1)c^2}\right)^{p} \left(\frac{2\omega'^2\sigma_h^2\sin^2\psi}{(n+1)c^2}\right)^{q}$$

The first several of the coefficients $G_{n,l,m}(p,q,\tau)$ have been
tabulated in Appendix G. Using (5.2-20), (5.2-16) and (5.2-12)
we have


Dn,i,mM  n+2i  ,n+2m^ 


{(n+l)[l-p2(i)]}  ,+n+“ 


n+i+m 

p,q-0 


2(p+\)+n  ,2(q-Hn)+n 
^n 


[1-p2(t)]  +  [e2+C2)  ♦  1)'+n+1,rt45  (5.2-21) 

n  n  n  n  n 

0 

where we define the normalized auto-covariance function for $h(t)$:

$$\rho_h(\tau) = \frac{\Gamma_h(\tau)}{\sigma_h^2} \qquad (5.2-22)$$

and

$$\xi_n = \sqrt{\frac{2}{n+1}}\;\frac{\sigma_h\sin\psi}{c}\;\omega, \qquad \xi_n' = \sqrt{\frac{2}{n+1}}\;\frac{\sigma_h\sin\psi}{c}\;\omega' \qquad (5.2-23)$$

are the normalized frequencies. As a practical matter, the fourth
order dependence on frequency for the term $\xi_n^2\xi_n'^2[1-\rho_h^2(\tau)]$ creates
certain difficulties when performing various integrals over frequency.
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In  order  to  circumvent  these  difficulties,  we  rewrite  the  denominator 
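of (5.2-21) as follows:

$$\frac{1}{\{\xi_n^2\xi_n'^2[1-\rho_h^2(\tau)] + \xi_n^2 + \xi_n'^2 + 1\}^{n+l+m+\frac12}} = \frac{1}{\{-\xi_n^2\xi_n'^2\rho_h^2(\tau) + (\xi_n^2+1)(\xi_n'^2+1)\}^{n+l+m+\frac12}}$$

$$= \frac{1}{\left\{(\xi_n^2+1)(\xi_n'^2+1)\left[1 - \dfrac{\rho_h^2(\tau)\,\xi_n^2\xi_n'^2}{(\xi_n^2+1)(\xi_n'^2+1)}\right]\right\}^{n+l+m+\frac12}} \qquad (5.2-24)$$

Next we note that

$$0 \le \frac{\rho_h^2(\tau)\,\xi_n^2\xi_n'^2}{(\xi_n^2+1)(\xi_n'^2+1)} < 1 \qquad (5.2-25)$$

and for large $\tau$ or for small $\xi_n$ and $\xi_n'$ the lower bound is
approached. Using the result (ref. 80) that

$$(1-z)^{-a} = {}_1F_0(a;\,;z) \qquad (5.2-26)$$

we have that (5.2-24) becomes

$$\frac{1}{\{\xi_n^2\xi_n'^2[1-\rho_h^2(\tau)] + \xi_n^2 + \xi_n'^2 + 1\}^{n+l+m+\frac12}} = \sum_{r=0}^{\infty} \frac{\Gamma(l+m+n+r+\frac12)}{\Gamma(l+n+m+\frac12)\; r!}\; \frac{[\rho_h^2(\tau)\,\xi_n^2\xi_n'^2]^r}{[(\xi_n^2+1)(\xi_n'^2+1)]^{n+l+m+r+\frac12}} \qquad (5.2-27)$$

Of all the expansions made thus far in the analysis this last one
converges least rapidly. However, convergence is worst for the case of
$\tau = 0$, and then only for $\xi_n\xi_n'$ greater than 1. In this case alternative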
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procedures  can  be  used  to  perform  the  various  Integrals  that  arise. 
For example, by writing

$$\frac{1}{\{\xi_n^2\xi_n'^2[1-\rho_h^2(\tau)] + \xi_n^2 + \xi_n'^2 + 1\}^{n+l+m+\frac12}} = \sum_{r=0}^{\infty} \frac{\Gamma(l+m+n+r+\frac12)}{\Gamma(l+m+n+\frac12)\; r!}\; \frac{\{-\xi_n^2\xi_n'^2[1-\rho_h^2(\tau)]\}^r}{(1+\xi_n^2+\xi_n'^2)^{l+m+n+r+\frac12}} \qquad (5.2-28)$$

The convergence of this series is more rapid for $\tau$ near 0 and $\xi_n\xi_n'$
greater than unity. Unfortunately, this representation fails to
converge at all for frequencies too high for the condition

$$\frac{\xi_n^2\xi_n'^2[1-\rho_h^2(\tau)]}{1+\xi_n^2+\xi_n'^2} < 1 \qquad (5.2-29)$$

to be satisfied. Therefore, in order to avoid questions of convergence
for integrals of these functions over frequency we will
use (5.2-27) throughout the rest of the analysis. Computationally,
(5.2-28) still offers advantages for certain applications. However, an
incidental advantage of (5.2-27) is that it is a sum of terms which
are factored in such a way that multiple integrals over $\xi_n$ and $\xi_n'$ can
be performed with greater ease.
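
The trade-off between the two expansions can be made concrete with a short numerical sketch. The function names and test values below are assumptions of this example; `a` stands for the exponent $n+l+m+\tfrac12$, and both series are summed with the same binomial-type coefficients $\Gamma(a+r)/[\Gamma(a)\,r!]$.

```python
def binomial_series(z, a, terms):
    """Partial sum of (1 - z)**(-a) = sum_r Gamma(a+r)/(Gamma(a) r!) * z**r."""
    total, coeff, power = 0.0, 1.0, 1.0
    for r in range(terms):
        total += coeff * power
        coeff *= (a + r) / (r + 1.0)   # ratio of successive coefficients
        power *= z
    return total

def denominator_direct(xi, xi_p, rho, a):
    """Left-hand side shared by (5.2-27) and (5.2-28)."""
    return (xi**2 * xi_p**2 * (1.0 - rho**2) + xi**2 + xi_p**2 + 1.0) ** (-a)

def expansion_5_2_27(xi, xi_p, rho, a, terms):
    base = (xi**2 + 1.0) * (xi_p**2 + 1.0)
    z = rho**2 * xi**2 * xi_p**2 / base          # always in [0, 1)
    return binomial_series(z, a, terms) / base**a

def expansion_5_2_28(xi, xi_p, rho, a, terms):
    base = 1.0 + xi**2 + xi_p**2
    z = -(xi**2) * xi_p**2 * (1.0 - rho**2) / base
    return binomial_series(z, a, terms) / base**a

def ratio_5_2_29(xi, xi_p, rho):
    """Quantity that must stay below 1 for (5.2-28) to converge."""
    return xi**2 * xi_p**2 * (1.0 - rho**2) / (1.0 + xi**2 + xi_p**2)
```

At low normalized frequency both partial sums agree with the direct evaluation after a few tens of terms. At high frequency and small $\rho_h$ the ratio in (5.2-29) exceeds unity, so only the factored form (5.2-27) remains usable.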

It should be noted that the density (5.2-11) is singular for
$\tau = 0$, as are the coefficients $G_{n,l,m}(p,q,\tau)$, but the product
$[1-\rho_h^2(\tau)]^{l+n+m}\, G_{n,l,m}(p,q,\tau)$ which arises in (5.2-21) remains finite.
Substituting (5.2-27) and (5.2-21) into (5.2-17) we obtain

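$$\Phi(\omega,\omega',\tau) = \frac{e^{-j(\omega-\omega')T_s}}{(r_o+r_o')^2} \sum_{n=0}^{\infty} \frac{1}{(n!)^2} \exp\left\{jBn^2\left(\frac{1}{\omega} - \frac{1}{\omega'}\right)\right\} Q_{\chi_1}(-n,n,\tau)\,\varepsilon_n \cos n\Omega_s\tau$$

$$\times \sum_{l=0}^{\infty}\sum_{m=0}^{\infty} \frac{1}{[4(n+1)]^{l+m}}\; \frac{A(l,2n+1)\,A(m,2n+1)}{l!\;m!} \sum_{r=0}^{\infty} \frac{\Gamma(l+m+n+r+\frac12)}{\Gamma(l+n+m+\frac12)\;r!}$$

$$\times \sum_{p,q=0}^{n+l+m} \frac{\{(n+1)[1-\rho_h^2(\tau)]\}^{l+n+m}\, G_{n,l,m}(p,q,\tau)\,[\rho_h^2(\tau)]^r\; \xi_n^{2(r+p+l)+n}\; \xi_n'^{\,2(r+q+m)+n}}{[(\xi_n^2+1)(\xi_n'^2+1)]^{n+l+m+r+\frac12}} \qquad (5.2-29)$$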

for the interfrequency system correlation function, and for comparison
with (4.1-14) we have the tri-frequency spectral density:

' )  «  (5.2-30) 

d0)M 


00  00 


e  -](u-m’)ts 

(r+r ')2 
O  0 


o 

1  exp {j Bn  I  1  -  li}  z  \  \  A(i  ,2n-H)A(m,2n+l) 


V  ± expijon  /  i  -  i \  \  Ali.z 

l  (n!)2  1“  "Z  L~7> 

n=-o  i  bq  id*»o 


m! 


]  Ill 
L-  r( 


+n+nrt-r+^) 


r=o 
where 


+n+m+*$) 


rT  l 


S  (<.”)  e^<rfP+')+n^2(r+q-Hn)+n 

n,i,m,p,q'  - 


p,q*o 


[(c.2+i)(C2+i)]n+,+mtr+>i 


S  (to’’) 

n,i,m,p,q,r 


(5.2-31) 


J  e  1  Qx  (-n,n,T){(n+l)[l-p^(T)]}1+n+m{p^(x)}r  Gn>l  >m(p»q ,T)Cosnflgi 


is  the  spectrum  of  the  slow  variations  of  the  scattering.  When  the 


dx 
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time  variations  of  the  surface  positional  phase  x^(t)  and  vaveheight 

$h(t)$ are slow compared to the mean translational frequency $\Omega_s$, then
the presence of the term $\cos n\Omega_s\tau$ in (5.2-32) produces sidebands in
$S_{n,l,m,p,q,r}(\omega'')$, which is not the case in (4.1-15). Since the $\xi_n$
decrease with increasing $n$ for fixed $\omega$ and since $\xi_n = 1$, higher order
sidebands are excited at higher values of $\omega$.

Similar  comments  apply  to  the  smearing  function 

dC(VtV)  (5.2-32) 

dy 


/  jl  r  £n  r  r 

(r+r*)2  *  (n!)2  *  * 

o  o  n*o  i*o  m* 


00 


n+i+m 


A(\  ,2n+l)A(m,2n+l)  ^  r ( x-Hn-fn-f r+^)  1 


i! 


m; 


l  r(\+ra+n+r)  rJ  \ 
r*o  *  p,q=o 


S  W)W)2<”*'***>* 

n,t,m,p,q,r  n 

,  v  Ar+2(n+i+m+p+q)  f  ,  2(n+i+m)+*j„  f  ...  - 

x(fT>  Vn  K2(n+>im)-Mi(vn)] 


n 


where  the  scaled  delay  spread  parameter  is  given  by 


/  n+1  _ c 

2  Sin  i|/ 


That is, the more intricate delay distortions fade more rapidly. Thus,
while this model is more complex than the random amplitude and
delay model of Chapter 4 because of the detailed nature of $H(\omega,t)$,
many of the general properties of the two scattering models are
the same.
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5.3  Multipath  Correlator  Fluctuations 


for  the  Random  Sinusoidal  Boundary 

The spectra (4.2-2) and filters (4.2-3) were chosen to be
Gaussian shaped in order to simplify the various integrals over
frequency that arise in the analysis of the random amplitude and
delay model. However, the Gaussian shape is not suitable for use
with expression (5.2-29) for $\Phi(\omega,\omega',\tau)$ in computing the normalized
covariances (4.2-1). This is primarily due to the factor
$[(\xi_n^2+1)(\xi_n'^2+1)]^{\frac12}$ in the denominator of each of the terms in
(5.2-29). The presence of these branch point singularities inevitably
leads to Bessel functions or other functions which are poorly tabulated
or otherwise unfamiliar.

In order to circumvent these difficulties we assume in this
case that the spectra are of exponential form:

a)  S  M  -  HJx  (5.3-1) 

xx  „ 

X 


b)  S  (w)  -  S 
nlnl 


n2n2 


r  \  HP  - 1  (i)  I  /fln 
(u>)  -  _ n  e  1  1  n 


n 


These spectra have roughly the same power in the band from 0 to
$\Omega$ as the spectra in (4.2-2) although they tend to zero less
rapidly for large $\omega$. Similarly, we take the filters as


H^w)  ■  H2(u>)  ■  e  ^M^f 


(5.3-2) 


The direct path transfer function is given by (3.1-1). Using
equation (3.4-5) and the expression for the coherent response $H_c(\omega)$
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given  in  (5.2~2)  we  have 


s(t,T,p) 


I  e^^'V  «  Px  e-M(i-  ♦  i_  ) 

“ft -  f  X 


E  -JL  -C2n)!  A(n,2) 
n*0  2n+1  (n»)2 


K /«  * 


.2n 


D  +  r]  n+% 


dw  s 


(5.3-3) 


n=0  2 "+1  fvp-  RE{  /  e‘?oCo  C“fx-i(T-0)] 


f2" 


Kj 


dC 


n+% 


o} 


where RE{ } stands for the "real part of" the enclosed quantity,
and where

$$\alpha_{fx} \equiv \frac{1}{\Omega_{fx}} \equiv \frac{1}{\Omega_f} + \frac{1}{\Omega_x} \qquad (5.3-4)$$

$$c_n \equiv \sqrt{\frac{n+1}{2}}\;\frac{c}{\sigma_h\sin\psi} \qquad (5.3-5)$$

$$\tau_o \equiv T_s - T_d \qquad (5.3-6)$$

Now the integral over normalized frequency arises several times in the
analysis of this model for the scattering system and so it is worth
examining in some detail. Using the contour integral representation
given by Slater (p. 25, eq. 1.6.1.6) for the hypergeometric function
(5.2-26) we have that for $a > 0$

$$(1+z)^{-a} = \frac{1}{\Gamma(a)} \int_{-j\infty}^{j\infty} \Gamma(a+s)\,\Gamma(-s)\, z^s\, \frac{ds}{2\pi j} \qquad (5.3-7)$$

where the contour implied must pass from $-j\infty$ to $+j\infty$ between $-a$ and 0.
Using this the integral in (5.3-3) becomes
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00 


!J 


p  -uz  ,,  2. -a  . 

e  (1  ♦  u  )  du 


[  rjM 1  r(-s)  (  e”zu  up*2s  du  ds 


-y 

j" 


2tj  s 


f  r(p+2s+l)  r(a+s)  _N  ds  _ 

J  :jhs~-  tkt  (‘s) 


-  TOO 


j» 

P+1  t  A  S 


1  7  p  1  (  A  5 

nrw  (f)  J  (-5-)  rP«>+s*)»5  rftp+s+i)  rca+s)  r(-s) 

,j.  1 


ds 

2nJ 


e  2n  r(a) 


/2^p+1  r13  /  4  \hp+h,hp+l,*) 
llJ  °31  l“2  0  > 


where the contour is further restricted to pass from $-j\infty$ to $+j\infty$ between
0 and the larger of $-\tfrac12 p - \tfrac12$ and $-a$. The $G^{mn}_{pq}(x)$ function is called a
Meijer function (ref. 83) and the integral defining it always converges
provided RE$(z) > 0$. The contour is normally closed in the left half
plane for the purpose of evaluating the integral, although an asymptotic
representation (ref. 82) can be obtained by closing in the right half plane.
Since for the applications of interest here the quantity $a$ is either an
integer or a half integer, some of the poles of $\Gamma(a+s)$ overlay the
poles of $\Gamma(\tfrac12 p+\tfrac12+s)$ or $\Gamma(\tfrac12 p+1+s)$ in the integrand of the defining
integral, thereby making them second order. By a translation and reflection
change of variables we can rewrite (5.3-8) in a more compact form:
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(5.3-9) 


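$$\int_0^{\infty} u^p\, e^{-zu}\,(1+u^2)^{-a}\, du = \frac{1}{2\sqrt{\pi}\,\Gamma(a)} \int_{-j\infty}^{j\infty} \Gamma(-s)\,\Gamma(\tfrac12 - s)\,\Gamma(a - \tfrac12 p - \tfrac12 - s)\,\Gamma(\tfrac12 p + \tfrac12 + s) \left(\frac{z^2}{4}\right)^{s} \frac{ds}{2\pi j}$$

$$= \frac{1}{2\sqrt{\pi}\,\Gamma(a)}\; G^{31}_{13}\!\left(\frac{z^2}{4}\;\middle|\;\begin{matrix}\tfrac12-\tfrac12 p\\[0.5ex] 0,\ \tfrac12,\ a-\tfrac12 p-\tfrac12\end{matrix}\right)$$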


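Another obvious property of the Meijer function notation is the
simplicity of differentiation:

$$\frac{d^m}{dz^m}\; G^{31}_{13}\!\left(\frac{z^2}{4}\;\middle|\;\begin{matrix}\tfrac12-\tfrac12 p\\[0.5ex] 0,\ \tfrac12,\ a-\tfrac12 p-\tfrac12\end{matrix}\right) = (-1)^m\; G^{31}_{13}\!\left(\frac{z^2}{4}\;\middle|\;\begin{matrix}\tfrac12-\tfrac12(p+m)\\[0.5ex] 0,\ \tfrac12,\ a-\tfrac12(p+m)-\tfrac12\end{matrix}\right) \qquad (5.3-10)$$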


Although  the  Meijer  function  serves  as  a  convenient  identification  of 
the  contour  integral,  it  tells  us  nothing  about  how  to  compute  its 
values  for  various  z.  Computation  schemes  are  discussed  in  Appendix  I 
where  series  representations  are  presented. 

Nevertheless,  using  this  notation  we  can  express  the  mean  of  the 
output  for  the  multipath  correlator  using  (5.3-3)  and  (5.3-9): 


S(t,T,P) 


-c  p  /n 
o  x  x 

W1? 


z 

n=0 


(2n) ! 

(n!)2 


1 

2tt  rMs) 


(5.3-11) 


Roughly speaking, the peak of correlation which occurs at $\tau = \tau_o$ has a
width which is on the order of $1/c_0$ for very wide bandwidths
(small $\alpha_{fx}$). For very small bandwidths the peak width approaches $\alpha_{fx}$
as it should.

The integrals $I_k$ ($k$ = 1 to 4) which arise in the expression (3.3-8)
for the variance of the output of the correlator have been summarized
in Appendix H. Their derivation is straightforward because all multiple
integrals that occur can be written as sums of iterated integrals of
the form given in (5.3-9). It is unnecessary to write down the full
expression for the variance since the presentation of the $I_k$ in Appendix
H is already in a format suitable for numerical computation provided a
basic program is constructed to implement equation (3.3-8).

Instead, we concentrate on examining the plateau of variance for
this model corresponding to (4.2-11) for the random amplitude and delay
model. From chapter 4 we know that this plateau is given by


Lim  WV0* 


_  .  Lim 

tt*°  [.h(t0-,t,pT'J2  ' 


p  /n 

X  x 


[H(t0,T,p)j2R2(Vr-)2  n=0  >_ 


+«  _ 

«  t  r  c  2  CO  to 

r  (^r)  Qv  c-n,n,u)  e  Cos(n!l  o)  L  l 

=0  1  _  >  "•  X1  n  s  1=0  m= 


[4(n+l)] 


1 _  A(l,2n+1)  A(m,2n+1)  1 _ 

^-|l+m  1!  m!  r(l+m+n+*s)  r(l+m+n+r+h)  r! 

(5.3-12) 


,  Un+l)[l-p2(u)])1+mtnGn>l  raCp,q,0)} 

p,q=0 


^-^n-r-p-l 

0,^,n+m+p 


)  dy 


^+^n-r-q-m’\  1 
0,Ji,n+l-q  '  * 


fjfu')  dii' 


1 
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where  the  f^J(p)  are  defined  by  (5.1-8).  As  mentioned  in  connection 
with that definition, the convolutions in (5.3-12) become less
important when the receiver and source are at great distance from the
active scattering region provided that the grazing angle is not too
small.

The plateau (5.3-12) is plotted versus $\sqrt{2}\,\sigma_h\sin\psi/c$ in figure (5.3-1)
for the limit for which the convolutions can be ignored. It can be
seen that similar comments can be made about the behavior of the
plateau for this model above and below the critical frequency (at
roughly $\Omega_{fx} \approx c_0$) as were made for the simpler model of chapter 4.

The plateau is not dependent on the variance of the positional
phase fluctuations of the surface since the characteristic function
$Q_{\chi_1}(-n,n,u)$ is unity for $u = 0$. We note, however, that if $\chi_1(t)$ is
Gaussian we have

$$Q_{\chi_1}(-n,n,u) = \exp\{-\sigma_{\chi_1}^2 n^2 [1-\rho_{\chi_1}(u)]\} = [Q_{\chi_1}(-1,1,u)]^{n^2} \qquad (5.3-13)$$

so that the rate of decay of the plateau of uncertainty with increasing
$u$ is enhanced when $\sigma_{\chi_1}^2$ is large. This is due to the improved averaging
over the surface fluctuations. The dependence of the critical
frequency only on the Rayleigh parameter suggests that the variable
$h(t)$ is analogous to a random delay. Again one must conclude that only
the coherent signal energy contains useful information if we use the
"on-target" normalized correlator output variance as the criterion of
performance. This must be qualified as in chapter 4 by the comments
made in Appendix D.
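
The scaling in (5.3-13) is easy to illustrate. In the sketch below the function name and the sample values of $\sigma_{\chi_1}^2$ and $\rho_{\chi_1}(u)$ are hypothetical; the code only evaluates the Gaussian characteristic function as written above.

```python
import math

def phase_characteristic(n, rho_u, sigma2):
    """Q_chi1(-n, n, u) of (5.3-13) for Gaussian chi_1.

    sigma2 is the variance of chi_1 and rho_u its normalized
    autocovariance evaluated at lag u."""
    return math.exp(-sigma2 * n * n * (1.0 - rho_u))

# Order n decays as the n**2 power of the first-order term.
q1 = phase_characteristic(1, rho_u=0.4, sigma2=0.7)
q3 = phase_characteristic(3, rho_u=0.4, sigma2=0.7)
```

Because the exponent grows as $n^2$, the higher sideband orders decorrelate much faster than the first, and at zero lag every order returns unity regardless of the phase variance.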
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Just  as  in  chapter  4  we  may  make  a  more  confident  evaluation  of 
the  tracker  performance.  Using  (5.3-11)  and  (5.3-10)  we  have 


p(T,  T,  p)] 


c3  p  /n 

0  X  X 

W1? 


I 

nO 


(2n) ! 
(n!)2 


0, 


h-vi-1 

>1.  -1)1 


(5.3-14) 


which is the quantity in the denominator of (3.6-3). The appropriate
integrals for the evaluation of the covariance of the tracking error,
$R_{\tau_o}(u)$, are also presented in Appendix H. We write here only the
plateau of uncertainty:
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x  x 
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1=0  t-^T! 


1! 


ml 
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r(l+ni+n+%)  r(l+m+n+r+^)  r! 
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n+l+m 

£  {(n+l)[l-p^(u)]  G  .  (p,q,y)  {p‘(o)> 

p,q.-o  n  n,i,m  n 


The quantity $R_{\tau_o}(0)/c_0^2$ is plotted in figure (5.3-2) versus $\Omega_{fx}/c_0$
and can be seen to be essentially the same as figure (5.3-1).

[Figure: Normalized Variance Plateau vs. Working Bandwidth]

[Figure: Normalized Delay Estimate Error vs. Working Bandwidth]

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5.4  Second  and  Fourth  Order  Cross  System  Statistics 
for the Two Channel Random Sinusoidal Boundary


In attempting to generalize the single channel transfer function
model of (5.1-2) we must take into account the fact that each receiver
"sees" different (although possibly overlapping) active regions on the
surface. Furthermore, the parameters describing the surface such as
the local waveheight, positional phase, or the orientation angle $\alpha_s$ may
be different in these regions. In this section we consider a model in
which the local waveheights $h_1(t)$, $h_2(t)$ and positional phases $\chi_1(t)$
and $\chi_2(t)$ are dependent on the relative location of the specular
points for the two receivers. The distances from the source to the
receivers are considered large compared to the separation between the
two receivers.

If the source is located at $Q(x_q,y_q,z_q)$ and if the two receivers
are at $P_1(x_{p1},y_{p1},z_{p1})$ and $P_2(x_{p2},y_{p2},z_{p2})$ then the specular points
$O_1(x_{o1},y_{o1},0)$ and $O_2(x_{o2},y_{o2},0)$ have coordinates given by

$$x_{oj} = \frac{z_{pj}\,x_q + x_{pj}\,z_q}{z_q + z_{pj}} \qquad (5.4-1a)$$

$$y_{oj} = \frac{z_{pj}\,y_q + y_{pj}\,z_q}{z_q + z_{pj}} \qquad (5.4-1b)$$

The horizontal distance between these two points is thus
$\sqrt{(x_{o2}-x_{o1})^2 + (y_{o2}-y_{o1})^2}$.
This displacement alone would account for a positional phase
difference of

$$\chi_s = q_s\{(x_{o2} - x_{o1})\cos\alpha_s + (y_{o2} - y_{o1})\sin\alpha_s\} \qquad (5.4-2)$$
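
The geometry of (5.4-1) and (5.4-2) can be sketched as follows. The function names and the coordinates used in the check are hypothetical, and the code assumes that the $z$ coordinates are positive distances of the source and receivers from the mean surface plane.

```python
import math

def specular_point(source, receiver):
    """Horizontal coordinates of the specular point, equations (5.4-1a,b).

    source = (x_q, y_q, z_q), receiver = (x_p, y_p, z_p); z values are
    assumed to be positive distances from the mean surface plane."""
    x_q, y_q, z_q = source
    x_p, y_p, z_p = receiver
    denom = z_q + z_p
    return ((z_p * x_q + x_p * z_q) / denom,
            (z_p * y_q + y_p * z_q) / denom)

def phase_offset(o1, o2, q_s, alpha_s):
    """Positional phase difference chi_s of equation (5.4-2)."""
    return q_s * ((o2[0] - o1[0]) * math.cos(alpha_s)
                  + (o2[1] - o1[1]) * math.sin(alpha_s))

# Hypothetical geometry: two receivers 40 m apart, 1 km from the source.
src = (0.0, 0.0, 100.0)
o1 = specular_point(src, (1000.0, 0.0, 100.0))
o2 = specular_point(src, (1000.0, 40.0, 100.0))
chi_s = phase_offset(o1, o2, q_s=2.0 * math.pi / 100.0, alpha_s=math.pi / 2.0)
```

For equal source and receiver depths the specular point lies midway between them, so a receiver separation of 40 m moves the specular point by 20 m. With the surface wave travelling along the line joining the receivers this corresponds to a fixed phase offset of one fifth of a cycle for a 100 m wavelength.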


However, we consider here a slightly more general model in which the
waveheights $h_1(t)$, $h_2(t)$ and positional phases $\chi_1(t)$, $\chi_2(t)$ "seen" by
the two receivers are loosely coupled random variables. In order to
account for the fixed positional phase displacement (5.4-2) the "initial"
phases for the two surface areas are taken as $\chi_0$ and $\chi_0 + \chi_s$
respectively with $\chi_0$ uniformly distributed over the interval $[-\pi,\pi]$.
The two channels therefore become

$$H_i(\omega,t) = -\frac{e^{-j\omega T_{si}}}{r_{oi} + r_{oi}'} \sum_{n=-\infty}^{\infty} J_n[A_i(t - r_{oi}/c)\,\omega]\; e^{+jB_i n^2/\omega}$$

$$\times \exp[+jn\Omega_s(t - r_{oi}/c) + jn\chi_i(t - r_{oi}/c) + jn\{\chi_0 + (i-1)\chi_s\}] \qquad (5.4-3)$$

for $i = 1,2$

where $r_{o1}$ and $r_{o2}$ are the respective distances of the two receivers
from their specular points and where

$$A_i(t) = \frac{2h_i(t)\sin\psi_i}{c} \qquad (5.4-4)$$

$$B_i = \frac{q_s^2\, c\, R_{ei}}{4}\left[\frac{\cos^2\alpha_s}{\sin^2\psi_i} + \sin^2\alpha_s\right] \qquad (5.4-5)$$

$$T_{si} = (r_{oi} + r_{oi}')/c \qquad (5.4-6)$$

$$R_{ei} = \frac{2\,r_{oi}\,r_{oi}'}{r_{oi} + r_{oi}'} \qquad (5.4-7)$$

We make the usual assumptions about the stationarity of the random
parameters $h_i(t)$ and $\chi_i(t)$. We further assume that the distances of
the source and receivers from the specular points are great enough so
that $\psi_1 \approx \psi_2$ and that the dimensions of the array are small enough
so that the retardation time difference $(r_{o2} - r_{o1})/c$ is negligible
on the time scale of the slow time variations of the waveheight and
positional phase. The spherical attenuation factor
$1/(r_{oi}+r_{oi}')$ is replaced by $1/(r_o+r_o')$ where $r_o$ and $r_o'$ are nominal values
for $r_{oi}$ and $r_{oi}'$. Thus,

*ik(w»w',T)  « 


e'^Tsiu-tsk<l1'^ 

(vri)2 


£ 

n=0 


(n!) 


.xp 


B, 


-  sr  )l  'Wn,n’T) 


(5.4-8) 


Cos[n(fi  t+xc)1  £ 
s  s  1=0 


£ 

m=0 


1 

Q4(n*l)] 


1+m 


r^+ni+n+r+^) 

r(l+m+n+^)r! 


n+l+m 

£ 

p,q=0 


t(n.l)  [l-p*  h  Ct)j}Untn 

X  iC 


Jn,l,m 


(p,q,T) 


{phihk(T)}r 


^2(r+p+l)+n  ^,2(r+q+m)+n 

[a*  *  i)  ce;2  *  13]  n+ltm+Tt!5 


where we have assumed that the two channels have similar statistical
properties (i.e. $\sigma_{h1} = \sigma_{h2} = \sigma_h$). The waveheights $h_i(t)$ are taken to
be independent of the positional phase fluctuations $\chi_i(t)$. Again,
asserting these parameters to be Gaussian random variables we achieve
fourth order cross stationarity. The fourth order system moment is,
however, a great deal more complicated owing to a more complex cancellation
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of  positional  phases: 


♦12(u,w,,u>M,w»y,»y")  - 


H^(u,t)  HjCw1 1  * ,t-w' ')*  - 


-jt8i((u-o)')-  JTi2(w' *-tor  •*) 

<rn+r^4 

0  0 


I  !  S  i 

J»0  £»0  g*0 


(5.4-9) 


Sin  ^  Jn2|2wi  Sin  ^ 

c  ]  I  c  ) 


Jn3  2uMh2(t-yM)  Sin  i^2  ^  Jn4/2w* 1  'h^tV  ’  ’)Sin  ^  ) 

I  “  J  (  ~  | 


^(n^nj+n^Xngt+x^  +  j  (n2y-n3p  '-K^y '  ’)  nfi  e+J(n3“  n4>X 


8 


eXP{  dBl|?  '  Si  -jMS-  =M  <)[Jx2(rV-n2'n3>-nVU'^^") 


where

    $Q^{[4]}_{x_1 x_2}(n_1,n_2,n_3,n_4,\mu,\mu',\mu'') = \overline{\exp\{+j[n_1 x_1(t) + n_2 x_1(t-\mu) + n_3 x_2(t-\mu') + n_4 x_2(t-\mu'')]\}}$    (5.4-10)

is the fourth order characteristic function for the time varying
positional phase fluctuations. Carrying out the average over the
"initial" phase $x_o$ we find that only those terms for which

    $n_1 - n_2 + n_3 - n_4 = 0$    (5.4-11)

are retained, thereby yielding fourth order stationarity. It is also
clear that for any four numbers $n_1, n_2, n_3, n_4$ that satisfy this condition,
the numbers $-n_1, -n_2, -n_3, -n_4$ satisfy it too. For example, the following
are the possible combinations for $|n_1| + |n_2| + |n_3| + |n_4| \le 2$:


      n1   n2   n3   n4

       0    0    0    0

       1    1    0    0
       0    0    1    1
       1    0    0    1
       0    1    1    0
       1    0   -1    0
       0    1    0   -1

      -1   -1    0    0
       0    0   -1   -1
      -1    0    0   -1
       0   -1   -1    0
      -1    0    1    0
       0   -1    0    1


Figure  5.4-1


The proliferation of terms beyond those of second order imposes a
practical limit on the usefulness of this model to weak scattering.
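
The selection rule (5.4-11) lends itself to a mechanical enumeration. The following Python sketch is an illustration only (the function name `allowed_index_sets` and the brute-force search are assumptions, not part of the original analysis). It lists every integer quadruple that satisfies the rule up to a chosen total order and reproduces the thirteen entries of Figure 5.4-1.

```python
from itertools import product

def allowed_index_sets(max_order):
    """Return all (n1, n2, n3, n4) with n1 - n2 + n3 - n4 == 0 and
    |n1| + |n2| + |n3| + |n4| <= max_order (brute-force search)."""
    span = range(-max_order, max_order + 1)
    return [
        n for n in product(span, repeat=4)
        if n[0] - n[1] + n[2] - n[3] == 0
        and sum(abs(k) for k in n) <= max_order
    ]

second_order = allowed_index_sets(2)
print(len(second_order))  # number of rows in Figure 5.4-1

# Every allowed set has an allowed mirror image (-n1, -n2, -n3, -n4).
for n in second_order:
    assert tuple(-k for k in n) in second_order

# Counting the sets at the next even order illustrates the proliferation of terms.
print(len(allowed_index_sets(4)))
```

Raising `max_order` shows how quickly the number of retained terms grows, which is the practical limit referred to above.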

In order to apply the expansion (5.2-4) for the Bessel function
to (5.4-9) we must first isolate Bessel functions of negative order by
utilizing the identity

    $J_{-n}(z) = (-1)^n J_n(z)$    (5.4-12)
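
The identity (5.4-12) can be checked numerically. The sketch below is illustrative only: it evaluates integer-order Bessel functions from Bessel's integral, $J_n(z) = \frac{1}{\pi}\int_0^\pi \cos(n\tau - z\sin\tau)\,d\tau$, with a trapezoid rule. The function name `bessel_j`, the step count and the test argument are assumptions made for this example.

```python
import math

def bessel_j(n, z, steps=2000):
    """Integer-order Bessel function J_n(z) from Bessel's integral,
    evaluated with the trapezoid rule on [0, pi]."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps + 1):
        tau = k * h
        weight = 0.5 if k in (0, steps) else 1.0  # half weight at the end points
        total += weight * math.cos(n * tau - z * math.sin(tau))
    return total * h / math.pi

z = 1.7  # arbitrary test argument
for n in range(1, 5):
    lhs = bessel_j(-n, z)
    rhs = (-1) ** n * bessel_j(n, z)
    # J_{-n}(z) and (-1)^n J_n(z) should agree to rounding error
    assert abs(lhs - rhs) < 1e-9
```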

This can be rewritten as

    $J_n(z) = j^{(|n|-n)} J_{|n|}(z)$    (5.4-13)

which leaves the Bessel function unaltered if n is positive but
yields the appropriate factor of $(-1)^n$ if n is negative. Using
this (5.4-9) becomes

“JTai  Cc^-io*)— jre2 Co)'  ’-w*  *  •) 


*lj  (w.‘‘>,»w,\wM',v,v")  - 


I  i  i  i 

V»VVV  '  1  InJ^dnJ+oA 

Dn.S  v'nJS>  exPOB1(!i  -  !il  +  jB  A  -  \  ]  }  ej(n3"nA)x, 

lw  w1/  *lw»»  w'»«' 


5 


..4 


( r  +  r') 
0  o 


nl’’n2+n3“nA' 


A(m^,2|n^  |+i) 


] 


[4]  . 

^Xlx2  ^nl*“n2,n3“n4»v»v? »v' ’)  exP^j(n2v-n3v'+n^v’ ')} 


(5.4-14) 


where

    $\underline{\omega} = \{\omega, \omega', \omega'', \omega'''\}$    (5.4-15)

and

    $\underline{m} = \{m_1, m_2, m_3, m_4\}$,  $\underline{n} = \{n_1, n_2, n_3, n_4\}$    (5.4-16)

The $\nu'_{n,m}(\underline{\omega})$ are moments computed using the distribution for

    $\underline{h} = \{h_1(t), h_1(t-\mu), h_2(t-\mu'), h_2(t-\mu'')\}$    (5.4-17)

modified in a manner analogous to that used in (5.2-11). We assume
that $\underline{h}$ is Gaussian with a correlation matrix $R_h(\underline{\mu})$ which we here
abbreviate simply as R. Again, using transformations similar to those
used to obtain equation (5.2-11) we have


V*  m(£> 
n,m  — 


h1(t)2ml+l"ll  h1Ct-v)2m2+ln2t  h,<t-P  ')2“3+|n3f 


h  <>-u»  »\2m4+|n4|  -**hTR' “1h 

h2(t  v  }  e_.-7.-n  -  dh]L(t)  dh^t-v)  dh^t-v’)  dh  (t-v”) 


(5.4-18) 


/(2npT]F] 
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r 


where the correlation matrix $R'$ (which is a function of both the
$n_i$ and of the frequencies $\underline{\omega}$) for the modified density is formed as
follows:

    $R' = (W + R^{-1})^{-1}$    (5.4-19)

Here the matrix W is a diagonal matrix which contains the
dependence on frequency:


The normalizing constant $D'_{n,m}(\underline{\omega})$ of the modified density of equation
(5.4-14) is given by


D' 

n,m 


(5.4-21) 


where $U$ is the inverse of $R'$. Expressions for the moments generated
by equation (5.4-18) may be derived by setting the equivalent order
cumulant for $\underline{h}$ equal to zero since the parent distribution is
Gaussian. Laning and Battin [49] present the general moment as a
combination of second order moments. From this fact alone we may
conclude that similar to equation (5.2-19) one may write these
moments as polynomials divided by the determinant of $U$ raised to
various powers:


,(n) 


P„  „(u) 

n,m 


n  ,m  — 


l»J‘ 


Zk  |n1l+m1 


(5 . A— 22) 


If we retain only those terms for the values of the $n_i$ tabulated
in Figure 5.4-1 then only second order moments are needed and these
are the elements of the matrix $R'$ itself. Using only terms to the second
order in frequency times waveheight


*{**(■....■  ...".n'W)  e-3'-e:(M-M')+JTB2(M"-<‘|,")  111*  r  (5.4- 
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+  "Hiio1  R23(0110)  0$2<o,-i.i.ofv> 
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Cos(n  (v-vf)+X)  exp{-j /__l-_^.  r 
8  8  iW’  con'- 
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where  • 
R‘  given  by 


(v,v',v"),  and  where  the  R^(n)  ere  elements  of  the  matrix 


Ri  j  <£) 


I2nl 


(5.4-24) 


The $|U|_{ij}$ are the ij-th minors of the determinant $|U|$ and (5.4-24)
is a special case of (5.4-22).

Of crucial importance to the performance of various integrals
for the variance calculations is the functional dependence of $|U|$
on the frequencies $\underline{\omega}$ since this determinant is raised to a fractional
power in all of the terms in (5.4-23). We denote the inverse of the
correlation matrix for the waveheights R by M and define normalized
frequencies in this case by


26 

fnj  +1 


o^  Sin  ift  w 

. . 

c 


(5.4-25) 


with similar connections for the pairs $(\xi_{n_2}, \omega')$, $(\xi_{n_3}, \omega'')$ and $(\xi_{n_4}, \omega''')$.
The constant $\delta$ is chosen later through convergence considerations.


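    $\begin{vmatrix}
    M_{11}+\xi_{n_1}^2 & M_{12} & M_{13} & M_{14} \\
    M_{21} & M_{22}+\xi_{n_2}^2 & M_{23} & M_{24} \\
    M_{31} & M_{32} & M_{33}+\xi_{n_3}^2 & M_{34} \\
    M_{41} & M_{42} & M_{43} & M_{44}+\xi_{n_4}^2
    \end{vmatrix}$
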

Just as the minor $|M|_{ij}$ is formed from the determinant $|M|$ by
deleting the i-th row and j-th column, we define a second order minor
$|M|_{ij,kl}$ which is formed by deleting both the i-th and k-th rows and
the j-th and l-th columns from $|M|$. For example,


    $|M|_{21,43} = \begin{vmatrix} M_{12} & M_{14} \\ M_{32} & M_{34} \end{vmatrix}$    (5.4-27)
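
The row and column deletion that defines these minors can be sketched in a few lines of Python. This is an illustration only: the helper names `det` and `minor`, the sample matrix, and the use of unsigned minors (no $(-1)^{i+j}$ factor) are assumptions made for the example.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(sub)
    return total

def minor(m, pairs):
    """Determinant that remains after deleting, for each 1-based (row, column)
    pair in `pairs`, that row and that column from m."""
    rows = {i - 1 for i, _ in pairs}
    cols = {j - 1 for _, j in pairs}
    kept = [[v for c, v in enumerate(row) if c not in cols]
            for r, row in enumerate(m) if r not in rows]
    return det(kept)

# Hypothetical symmetric 4 x 4 matrix standing in for M.
M = [[4, 1, 2, 3],
     [1, 5, 1, 2],
     [2, 1, 6, 1],
     [3, 2, 1, 7]]

# |M|_{21,43}: delete rows 2 and 4 and columns 1 and 3.
m_21_43 = minor(M, [(2, 1), (4, 3)])
assert m_21_43 == M[0][1] * M[2][3] - M[0][3] * M[2][1]
```

Deleting three rows and three columns in the same way leaves a single element of the matrix, which is the sense in which the third order minors are the elements of M.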

Similarly, minors of third order $|M|_{ij,kl,mn}$ are the elements of M
itself. Using this notation,

12-1  - 


S2  K2 
nin1 

•X 


|H| 


U.JJ 


E  E  E 

ni  "V Sc 

b3.6 

B  oh 

+  '  < 
L  _ J 

|M| .  +  |M| 

L 

|M| 


(5 • A— 28) 


a°h 


where again,

    $M = R^{-1}$    (5.4-29)

Unfortunately, this polynomial is of eighth order in frequency.
Although for the applications of interest in equations (3.5-16) through
(3.5-18) the squares of various frequencies are taken equal in pairs,
(5.4-28) is still more formidable than the denominator of (5.2-21).
However, we may use a series expansion again which is valid for all
frequencies and also produces convenient factorization of frequency
dependence


1  _ 

(|U  ||R|)v 

■  ii  — 


KC*  +1)(E^  +1)  (E*  +1)  (fi?  +1)V 
nl  n2  n3  n4 


(  1  +  _ j  v 

iu2  +du2  +i  )(52  +i  >u2  +D) 

nl  n2  n3  n4 


(5.4-30) 
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where  j>  •  (p^^fPjtP^)  from  (5.4-32) 


1  '8VS  *S’Vir  ‘  | s1  c"22  5"33  ^ 

•  (5.4-34) 

We  may  also  write  (5.4-21)  in  terms  of  the  normalized  frequencies 


5^  (l"1l+D1|  l»ll  +2m1- 
TT"  f 

*°h 

(5.4-35) 


Just as in the case of the second order moment, the coefficients in
(5.4-32) are singular for $\underline{\mu} = (0, \tau_o, \tau_o)$ where

    $\tau_o = T_{s1} - T_{s2}$    (5.4-36)

is the mean path delay difference for the two receivers. However, the
product $|R|^{\sum_i (|n_i| + m_i)} G_{n,m}(\underline{p}, \underline{\mu})$ remains finite.

Finally, we summarize by writing


D'  v'  (n) 
n,m  n,m 


1  Gn,m(2-’^  Cr(^  {  |  |  [ 


r  '  !  ,R,Wnil+mi 

r‘0  r! 

4  ( |n± |+1)  ^  |ni|-Hn±  (p^^q^+lnj^  ^ 

n  1  r  ^ 


(5.4-37} 


£  1 


i«i 


6  o 


[(«n  +1)(?n  +1)(?n  +1)<U  +1) ]E(’slnll+mi)+rt,s 
nl  2  3  n4  * 


The  results  may  be  substituted  into  (5.4-14)  to  complete  that 
expression. 
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CHAPTER  6 


POSSIBLE  EXTENSIONS  OF  THE  PRESENT  WORK 
AND  SUGGESTIONS  FOR  FUTURE  RESEARCH 
6.0  Extensions  to  the  Fully  Random  Boundary 

In chapter 5 we have considered the scattering surface to be a
sinusoidal boundary with random parameters. More generally we are
interested in an arbitrary random boundary $\zeta(x,y,t)$ with known statistical
properties. In particular, with the assumption of gaussian statistics,
only the spatial and temporal correlation function

    $\gamma_\zeta(\xi,\eta,\tau) = \overline{\zeta(x,y,t)\,\zeta(x-\xi, y-\eta, t-\tau)}$    (6.0-1)

need be specified.

The coherent transfer function is easily obtained from (1.4-3) in
the limit $B_e(\theta,\phi) \to 1$:


HCM 


,eMvro)  /c 

(r  +r 1 ) 
v  o  o' 


Q  (o>Sin4/c) 
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2  2  c.  2, 

a)  Oj.  Sm  4 


(6.0-2) 


The  result  is  the  same  as  Eckart’s  reflection  coefficient  (1.2-21)  for 
directional  reception. 

Unfortunately,  equation  (6.0-2)  is  about  the  only  trivial 
statement  one  can  make  about  the  fully  random  surface  model.  If  we 
advance  to  second  order  moments  we  immediately  run  into  difficulty. 
From  (1.4-13)  we  have 
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(60-3) 


5(x2,y2,t-x)  w'j  Sin^/c" 

dxi  dx2  **1  dy2 

Again, to simplify matters we take $B_e(\theta,\phi) = 1$ and assume gaussian
statistics for $\zeta$ to get


♦(u.uji)  -  -4mm'  S^2»  t  1  I2 

c1  *o*o 


0,  Sin*  2 

e-« - c - J  (“  +“'  )  e-jt  («-«•) 


.00 


eXP{"jj(“Xj  -  «t'X*)  (C°)  ♦  (ay*  .  u.y|)j  _1_  J 

e 


,expf~?  Vxi'x2-yr>VT)  Si"2  *) 


dXl  dx2  d*l  d*2  (6.0-4) 

The most difficult feature of this integral is the dependence of $\gamma_\zeta$
only on the differences

    $x_1 - x_2$,  $y_1 - y_2$

whereas the remainder of the integrand is dependent on the quantities

    $\omega x_1^2 - \omega' x_2^2$

and

    $\omega y_1^2 - \omega' y_2^2$
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In  certain  cases  transformations  or  reductions  of  the  integral  in 
(6.0-4) are possible. For example, if only the value of $\Phi$ for
$\omega = \omega'$ is desired, then one may write

    $\omega(x_1^2 - x_2^2) = \omega(x_1 + x_2)(x_1 - x_2)$    (6.0-5)

    $\omega(y_1^2 - y_2^2) = \omega(y_1 + y_2)(y_1 - y_2)$    (6.0-6)


Making the substitutions

    $\xi = x_1 - x_2$,  $\eta = y_1 - y_2$,  $\xi' = x_1 + x_2$,  $\eta' = y_1 + y_2$

we find that


♦(-...»>  -  (^qi^)2  (pir)2 

0  o 


(6.0-7) 


0-  (?c  Sin»  uj2 


.to 


exp{-j  [«•(#  ♦  «.']£} 


(6.0-8) 


xexp{(^int)  »K.n.i)J 


d5  dn  d£*  dn' 


One  can  immediately  perform  the  integral  over  the  primed  coordinates: 


5-j[cC'  (c°)  ♦  nn']  w/cRe 


dt'  dn' 


(6.0-9) 


*(C°)  0) 

‘Hfc-)  ® 


cRe' 
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Thus, 


j  .  l^cCMSin»  2 


where 


<Vr(P 


y  Co,o, t) 

vt) "  — 


)  [i  -  p,  (*)])- 


(6.0-10) 


(6.0-11) 


As a corollary one can conclude that

    $\Phi(\omega,\omega,0) = \overline{H(\omega,t)\,H(\omega,t)^*} = \frac{1}{(r_o + r'_o)^2}$    (6.0-12)


This last result is true due to the stationarity of $\zeta$ and does not
depend on the gaussian assumption. It does however, draw very heavily
on the truncation of the expansion for the exponent in (1.4-13) at
quadratic orders in x and y and at first order in $\zeta$. While (6.0-12)
gives the very simplistic average energy transmission which is flat
over an infinite bandwidth, it does at least conserve energy.
Furthermore, from (2.4-30) it follows that the autocorrelation function
for a wideband signal is left largely unaltered by this scattering
model.

These statements are quite general and qualitatively correct.
However,  nothing  much  can  be  said  about  the  fluctuation  of  the 
correlator  output  for  the  multipath  detector  unless  we  can  obtain 
4>(o),u>!t)  for  id  i  U)’  in  (3.3-12)  for  I^(t,tJv).  In  this  case  the 
transformations (6.0-5) and (6.0-6) are not possible; one must be
prepared to do each integral over $y_1$ and $y_2$ separately. This
is clear since either $\omega$ or $\omega'$ may be chosen to be zero independently.

In any case, if it is desired to describe the decrease in the
plateau of variance of the correlator output for long integration
times one must have an expression for $\gamma_\zeta(\xi,\eta,\tau)$. This in itself is a
non-trivial task.

In general $\gamma_\zeta(\xi,\eta,\tau)$ satisfies the same wave equation as $\zeta(x,y,t)$.
If we choose to ignore dispersion in these waves then

    $\left[\frac{\partial^2}{\partial\xi^2} + \frac{\partial^2}{\partial\eta^2} - \frac{1}{c_g^2}\frac{\partial^2}{\partial\tau^2}\right]\gamma_\zeta(\xi,\eta,\tau) = 0$    (6.0-13)

Therefore, the behavior of $\gamma_\zeta(\xi,\eta,\tau)$ as a function of $\tau$ for a given
$\gamma_\zeta(\xi,\eta,0)$ is that of an initial deformation on an infinite membrane.
The progression of $\gamma_\zeta(\xi,\eta,\tau)$ is therefore determined by a superposition
of linear surface waves,


•0 

V«.n.T>  '  f  J  E?(qx,qy)  J  [V  +  V  -  "(VV  T]  dqx  dq 


-00 


where 


fl(q  ,q  )  e  c  ^1-q^-q^ 

Vnx»My^  s  HX^y 


y 

(6.0-14) 

(6.0-15) 


and $c_g$ is the propagation velocity for surface waves. The quantity
$E_\zeta(q_x,q_y)$ is the space spectrum for $\zeta$ and is determined from

v  a  y 

T5K.»l,o) 

Ec(qxqy)  *  (ji)2  /  (  ft(€.n.o)  e-J  <W  dqx  dqy 


•  00 


(6.0-16) 


The  integral  (6.0-14)  is  often  difficult  to  perform. 
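
Although the superposition integral is hard to evaluate in closed form, each of its components is easy to test against the non-dispersive wave equation (6.0-13). The Python sketch below is a hedged illustration: it assumes a single real plane-wave component with frequency equal to the wave speed times the wavenumber magnitude, and checks by central finite differences that it satisfies the wave equation. The numerical values and the names `plane_wave`, `second_diff` and `wave_residual` are assumptions for this example.

```python
import math

C_G = 1.5            # assumed surface-wave propagation velocity (arbitrary units)
QX, QY = 0.7, -0.4   # assumed wavenumber components of one component wave

def plane_wave(xi, eta, tau):
    """One real plane-wave component with non-dispersive frequency c_g * |q|."""
    omega = C_G * math.hypot(QX, QY)
    return math.cos(QX * xi + QY * eta - omega * tau)

def second_diff(f, x, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def wave_residual(xi, eta, tau):
    """Left-hand side of the wave equation evaluated numerically."""
    d_xi = second_diff(lambda s: plane_wave(s, eta, tau), xi)
    d_eta = second_diff(lambda s: plane_wave(xi, s, tau), eta)
    d_tau = second_diff(lambda s: plane_wave(xi, eta, s), tau)
    return d_xi + d_eta - d_tau / C_G ** 2

print(wave_residual(0.3, -1.1, 0.8))  # close to zero, limited by the step size
```

Because the equation is linear, any weighted sum of such components also satisfies it, which is why the correlation function can be built from the space spectrum.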
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In  order  to  gain  insight  into  the  behavior  of  let  us  assume 

that $E_\zeta$ is rotationally symmetric in the $q_x q_y$ plane so that

    $E_\zeta(q_x,q_y) = E_\zeta(q)$    (6.0-17)

    $q = \sqrt{q_x^2 + q_y^2}$    (6.0-18)


We  may  then  write  (6.0-14)  as 


Vc(r,T)  =  J  E^(qxqy)  J0(p-) cos(qT)qdx  (6.0-19) 


For  example,  assume  that 


Ec(q) 


,  2/ .2 

rhq  /A 


(6.0-20) 


Substituting  (6.0-20)  into  (6.0-19)  and  using  the  expansion  (5.2-3) 
for  the  Bessel  function 


Vr'T>  *  £  (  (jF*)2” 

4  m=0  s 


(6.0-21) 


2  j  2 

q2m  e"2“  (“7  +  “T)  Cos  dq 
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where  the  D  are  parabolic  cylinder  functions.  For  small  r  and  t  the 

2 

first  term  is  informative:  t 


(6.0-22) 

Hence,  a  circular  ridge  of  correlation  propagates  outward  with 
increasing  t. 

The combination of this circular symmetry with the cartesian
geometry implicit in (6.0-4) presents many difficulties.

A less ambitious approach to (6.0-4) would assume $\zeta$ to be a random
corrugation rather than a general irregular surface. In that case the
correlation function could be written as a sum of a "left" going and a
"right" going component just as one treats the propagation of waves
on a string. Further simplifications arise if the corrugation is
aligned with either the x or y axis for then one may perform 2 of the
space integrals with great ease.

For  example,  if 


*ctt,n,o)  =  VcCC) 


(6.0-23) 


then  (6.0-4)  becomes 


4>(oo,oo  Jt) 


2oo  Sin  [p  r  1 


f  1  ;  1 

lr  r"  ( t  +r') 
0  0  o  0 


exp{  -li(-L|Hi)2  [u2  ♦  u.2]  -jTs(u-u')} 

e*2  B  - 


r  WOO ' 


+  Cl) 


♦  kr(  (  [xj  -  x2]  -Ct) 


}  dx.  dx, 

1  d 


(6.0-24) 


We  will  not  pursue  this  integral  further  since  an  adequate  treatment 
would  be  somewhat  lengthy. 

Instead,  we  confine  our  attention  for  the  remainder  of  this 
section  to  the  fourth  order  moment: 


2  (u>,u),,u>,,,<1>,",p,y,,u")  =  (^rr) 
*  *  00 


16  w  to '  w"  to"'  Sin2^.  Sin2\|^ 


rjTSl(^’)  “  jTs2(t0"-U)"  ') 
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////////  dXl  dX2  dx3  *4 
.00 

d/l  dy2  dy3  dy4 

BeCO^+p  Bo*(02,4>2)  Be(e3^3)  Be*(04,4>4) 

exp{-j  [wx2  -  w‘  (x2+Ax)2]  Sin2^1) 

exp{-j  [toy2  -  w'Cy2+Ay)2J} 

exp{-j  [o>*  'x^  -  u,M  (x^  +  Ax)2J  Sin2^} 

exp{-j  [o>"y2  -  (y4  +  Ax)^|} 

exP{^-  [wCjCXj^.t)  -  w’CjCXj+Ax.y^Ay^-p)]  Sinif^ 

“  •  ;  • 

^v{~  [_«"  C2  (x3,y3,t-y»)  j 

i . " _ 

C2(x4+Ax,y4+Ay,t-y' ’)j  Sinif^  _ 

(6.0-25) 

where $\Delta x$ and $\Delta y$ are the x and y displacements of the specular points
for the two receivers. We assume that the grazing angles $\psi_1$ and $\psi_2$
and distances $R_{e1}$ and $R_{e2}$ "seen" by the two receivers may be slightly
different.

This 8-fold integral contains the 4-th order characteristic
function for $\zeta_1$ and $\zeta_2$. This in turn is a function of the 12
differences in coordinates
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Clearly  such  an  integral  poses  a  formidable  challenge.  Yet  in  order  to 
be  of  any  real  use  this  integral  must  be  performed  for  a  wide  range  of 
frequencies.  Furthermore  for  the  applications  discussed  in  chapter  3 
this  integral  must  itself  be  integrable  as  a  function  of  frequency. 

The 8-fold space integral of (6.0-25) may be reduced to a 4-fold
integral  by  restricting  attention  to  corrugations  but  we  will  not 
write  this  result  here. 

We  have  here  barely  written  down  the  relevant  integrals  for  the 
fully  random  model.  While  at  the  time  of  this  writing  some  progress 
has  been  made  by  the  author  with  equation  (6.0-4),  nothing  has  yet 
appeared  in  the  literature  approaching  a  solution  of  (6.0-25).  Both  of 
these  integrals  are  presented  only  as  suggestions  for  future  research  or 
motivations  for  experimental  studies. 
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6.1  Other  Suggestions  for  Future  Research 


In closing it would be worthwhile to indicate a few other areas
of research which have only been treated briefly in this work. These
are listed in ascending order of complexity:

1)  Analysis of large arrays. Some relevant material on
this problem is presented in Appendix J.

2)  Derivation and computation of cumulants of higher order than
the fourth. This research would be useful for investigations of
optimal receiver design. Since the received signal is
non-gaussian, the optimal receiver would presumably be non-quadratic.

3)  Narrow band detection. While the applications studied
in this work treat low pass types of spectra, the results of
chapter 3 are general enough to be used in a study of the
effect of the frequency smear due to the time variation of
the scattering.

4)  Multiple surface reflection. In this case some of the
discussions in chapter 2 with respect to cascaded RLTVP's are
useful for the narrow band case.

5)  Analysis of scattering at low grazing angles. In this
case the active scattering region is not localized to the
vicinity of the specular point [75,76]. Also, shadowing and
multiple scattering become important unless the surface is not
very rough.

Of these topics the second one concerning optimal receiver design is
probably the most interesting. Presumably such a detector would be
self adaptive with respect to the slow axis time variations of the
scattering, perhaps identifying some of the coherent properties of the
scattering model. It should be possible to show that the information
lost by constraining the correlator detector to be steered "on-target" *
would be captured by the optimal detector.


*As discussed in Appendix D.
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APPENDIX. A 

MORGAN'S  DERIVATION  OF  EQUATION  (1.3-14) 


The  transition  from  Equation  (1.3-12)  to  Equation  (1.3-14)  is 
facilitated  by  executing  the  partial  derivative  of  q  with  respect  to 
n  and  rewriting  the  partial  of  q  with  respect  to  t  in  terms  of  the 
total  time  derivative.  From  the  definition  of  the  auxiliary  wave  q 
in  Equation  (1.3-9)  we  have 


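    $\frac{1}{c}\frac{\partial q}{\partial t} = \left.\frac{W'(\xi)}{r}\right|_{\xi = ct - ct_o + r}$ ;  $\frac{\partial q}{\partial r} = -\frac{q}{r} + \frac{1}{c}\frac{\partial q}{\partial t}$    (A.1-1)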


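where $W'(\xi)$ is the derivative of the narrow pulse function $W(\xi)$ with
respect to its argument, and t and r are regarded as independent
variables. Next we have

    $\frac{\partial q}{\partial n} = \frac{\partial q}{\partial r}\frac{\partial r}{\partial n} = \left(-\frac{q}{r} + \frac{1}{c}\frac{\partial q}{\partial t}\right)\frac{\partial r}{\partial n}$    (A.1-2)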


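and by straightforward application of the chain rule

    $\frac{1}{c}\frac{dq}{dt} = \frac{1}{c}\frac{\partial q}{\partial t} + \frac{1}{c}\frac{\partial q}{\partial r}\frac{dr}{dt} = \frac{1}{c}\frac{\partial q}{\partial t} + \left(-\frac{q}{r} + \frac{1}{c}\frac{\partial q}{\partial t}\right)\frac{\dot r}{c}$

      $= (1 + \dot r/c)\,\frac{1}{c}\frac{\partial q}{\partial t} - \frac{q\,\dot r}{c\,r}$    (A.1-3)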


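Solving this for $\partial q/\partial t$ and also substituting into Equation (A.1-2)

    (a)  $\frac{1}{c}\frac{\partial q}{\partial t} = \frac{\frac{1}{c}\frac{dq}{dt} + \frac{q\,\dot r}{c\,r}}{(1 + \dot r/c)}$

    (b)  $\frac{\partial q}{\partial n} = \frac{\left(\frac{1}{c}\frac{dq}{dt} - \frac{q}{r}\right)\frac{\partial r}{\partial n}}{(1 + \dot r/c)}$    (A.1-4)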
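
The derivative relations (A.1-1) and (A.1-3) can be checked numerically for a concrete pulse shape. The Python sketch below is illustrative only: it assumes the auxiliary wave has the form q = W(ct - ct_o + r)/r with a Gaussian pulse W, a constant range rate, and arbitrary units. All numerical values and the names `W`, `q`, `ddx` and `check` are assumptions for this example.

```python
import math

C = 2.0    # assumed propagation speed (arbitrary units)
T0 = 0.0   # assumed reference time t_o
V = 0.3    # assumed constant range rate dr/dt

def W(xi, width=0.5):
    """Assumed narrow pulse shape (Gaussian), for illustration only."""
    return math.exp(-(xi / width) ** 2)

def q(r, t):
    """Auxiliary wave q = W(ct - ct_o + r) / r, with r and t independent."""
    return W(C * t - C * T0 + r) / r

def ddx(f, x, h=1e-4):
    """Central finite-difference estimate of the first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def check(r=3.0, t=-1.4):
    dq_dt = ddx(lambda s: q(r, s), t)   # partial derivative in t at fixed r
    dq_dr = ddx(lambda s: q(s, t), r)   # partial derivative in r at fixed t
    # (A.1-1): dq/dr = -q/r + (1/c) dq/dt
    res_a11 = dq_dr - (-q(r, t) / r + dq_dt / C)
    # Total derivative along the moving range r(t') = r + V (t' - t)
    total = ddx(lambda s: q(r + V * (s - t), s), t)
    # (A.1-3): (1/c) dq/dt_total = (1 + rdot/c)(1/c) dq/dt - q rdot / (c r)
    res_a13 = total / C - ((1.0 + V / C) * dq_dt / C - q(r, t) * V / (C * r))
    return res_a11, res_a13

print(check())  # both residuals should be close to zero
```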
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Substituting  Equation  (A.l-4a,b)  into  Equation  (1.3-12)  we  have  directly 


ps(P-£o) 


-// 


dx  dy  , 
4n  1 


m 

i 


3p  v  dp 

(3n  +  2  3t 
c 


a  tv 

(^£. _ SL) 

v3n  2  ;  p 

+ - ^ - s 

r  7 


(l+r/c) 


xy 


(A.  1-5) 


-  7  sf  <— ■  :c  >  p.]  Hdt) 

C  dt  (l+i/c)  8 

**C(x,y,y) 


where H is defined in Equation (1.3-13). Equation (A.1-5) is the sum
of two terms one of which is multiplied by q and the other of which is
multiplied by dq/dt. Now in the limit of very narrow $W(\xi)$


q 


mi 

r 


1 

r (l+r/c) 


6(t  - 


(A. 1-6) 


Using this the first term of Equation (A.1-5) is easily reduced to the
first three terms of Equation (1.3-14). The second term in Equation
(A.1-5) is similarly reducible after integration by parts and utilizing
the fact that q is zero for $t = \pm\infty$.
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APPENDIX  B 


THE  PLANE  WAVE  EXPANSION  FOR  SURFACE  SCATTERING 

Meecham's derivation of the plane wave expansion for surface
scattering is readily extended to surfaces with two-dimensional and even
slowly varying irregularities. The analysis proceeds from an expansion
given by Levine and Schwinger [40] for the free-space point source Green's
function exp(jkr)/r which occurs in Equation (1.4-3)


4-M 


jlk  (x  -x  )+k  (y  -y  )+k  ( 2  -2  )) 

s  '*  17 . .  S  P  ■  dk  dk  dk 

(k2  +k2  +k2  -  k2)  x  y  z 

x  y  z 


JT* 


(B.l-l) 


j[k  (x  -x  )+k  (y  -y  )+/k‘  -k2  -k2  (z  -z  )] 

4  rr  P  x  s  p  y  s  V-  x  y  s  P'J 

e -  dk  dk 

2jj 

x  y 


where

    $r = \sqrt{(x_s - x_p)^2 + (y_s - y_p)^2 + (z_s - z_p)^2}$    (B.1-2)

and where in the context of Equation (1.4-3) $P(x_p, y_p, z_p)$ is the
observation point and $S(x_s, y_s, z_s)$ is a point on the surface.

The $\pm$ sign in Equation (B.1-1) is assigned as follows. Note that the
free-space Green's function represents an outgoing wave emanating from
P. This wave must be "upgoing" when $z_s > z_p$ and "downgoing" when
$z_s < z_p$. The result of the second equality in Equation (B.1-1) is in
the form of a plane wave expansion for this outgoing wave provided the
"+" sign is used when $z_s > z_p$ and the "-" sign is used when $z_s < z_p$.
This position dependence of the space-spectral density of the plane wave
expansion for the point source Green's function leads to a similar
dependence in plane wave expansions for surface scattered waves.
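
The sign rule just described can be made concrete with a short sketch. The Python function below is a hypothetical helper (its name, argument layout and the principal-branch square root are assumptions for this example). It evaluates the exponential factor of a single plane-wave component and picks the sign of the vertical term from the relative heights of the surface point and the observation point. With this choice, components whose horizontal wavenumber exceeds k decay away from the observation level instead of growing.

```python
import cmath

def plane_wave_factor(kx, ky, k, surface_pt, obs_pt):
    """Exponential factor exp(j[kx(xs-xp) + ky(ys-yp) +/- kz(zs-zp)]) for one
    plane-wave component, with '+' when z_s > z_p and '-' when z_s < z_p."""
    xs, ys, zs = surface_pt
    xp, yp, zp = obs_pt
    # Principal square root: purely imaginary for evanescent components.
    kz = cmath.sqrt(k * k - kx * kx - ky * ky)
    sign = 1.0 if zs > zp else -1.0
    phase = kx * (xs - xp) + ky * (ys - yp) + sign * kz * (zs - zp)
    return cmath.exp(1j * phase)

# Propagating component: unit magnitude whichever side the surface point lies on.
print(abs(plane_wave_factor(0.3, 0.2, 1.0, (0.0, 0.0, 2.0), (0.0, 0.0, 0.0))))
# Evanescent component: decays on both sides because of the sign selection.
print(abs(plane_wave_factor(3.0, 0.0, 1.0, (0.0, 0.0, -2.0), (0.0, 0.0, 0.0))))
```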

Substituting  Equation  (B.l-1)  into  (1.4-3)  we  immediately  have  the 
plane  wave  expansion  for  the  scattered  signal  received  at  P 


ps(p’t0)  -//vvVW  e 


-J[k  x  +k  y  -k2  -k^  7  ] 

x  p  y'p  x  yp 


dk  dk 
x  y 


■  /  /  A  (k  ,k  ,2  ,t  ) 
JJ  -  x  y  p*  o 


-j[k  x  +k  y  -k^.  -k^  z  ) 


(B.l-3) 


x  p  y  p 


x  y  p  dk  dk 
x  y 


in  which  the  (unknown)  wave  density  is  given  by 


vvvw  ‘hff  (?if£  +  G±k  lps(y->,>z-vt/c)])  dS 


S— (t  ,Z  ) 
r  o  p 


(B. 1-4) 


Here $S^{\pm}(t_o, z_p)$ is that portion of the appropriately retarded surface
$S_r(t_o)$ which lies above $z_p$ (for the "+" sign) or below $z_p$ (for the
"-" sign) and


G,(x  ,y  ,z  ) 
+  s'ys  s 


j[k  x  +  k  y  + 
x  s  y's  — 


+  JT- 


.2  d2  1 
k  -k  z  ] 
x  y  s 


i 


.2  ,2  .2 
k  -k  -k 
x  y 


(B. 1-5) 


Equation (B.1-3) shows that when portions of the surface lie above and
below the level $z_p$ of the receiver the density of upgoing and downgoing
waves becomes position dependent. However, when $z_p < z_s$ for
all $z_s$ we have only downgoing waves and the density of these plane
waves becomes position invariant (see Figure B.1-1):

    $A^-(k_x, k_y, z_p, t_o) = 0$
    $A^+(k_x, k_y, z_p, t_o) = A(k_x, k_y, t_o)$    (B.1-6)


This  argument  which  is,  of  course,  only  approximate  in  the  case  of  the 
time  varying  surface  is  seen  to  be  exact  for  the  fixed  surface  case. 


Situations for (a) Position Dependent
and (b) Position Invariant
Plane Wave Expansions
Figure B.1-1
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APPENDIX  C 


SPECULAR  POINT  EXPANSIONS  FOR  r  AND  r'


In order to simplify some of the mathematics necessary for obtaining
the expansions for r and r' about the specular point we follow the
development used by Gulin [38]. We begin by defining $P(x_p, 0, z_p)$ to be the
observation point, $Q(x_q, 0, z_q)$ to be the source location and $S(x, y, \zeta)$
to be a point on the surface. The origin of coordinates is taken to be
at the specular point (see Figure 1.4-1). Let $x_1$ be the x-axis
component of the distance between the source point Q and the surface
point S. If L is the x-axis separation between P and Q then

    $x_1 = x - x_p = x + \left(\frac{z_p}{z_p + z_q}\right) L$    (C.1-1)


First we expand r and r' to second order in $\zeta$ to obtain

    $r = \sqrt{x_1^2 + y^2 + (z_p - \zeta)^2} = R - \frac{z_p \zeta}{R} + [1 - (z_p/R)^2]\,\frac{\zeta^2}{2R} + \ldots$    (C.1-2)

    $r' = \sqrt{(L - x_1)^2 + y^2 + (z_q - \zeta)^2} = R' - \frac{z_q \zeta}{R'} + [1 - (z_q/R')^2]\,\frac{\zeta^2}{2R'} + \ldots$    (C.1-3)

where

    (a)  $R = (x_1^2 + y^2 + z_p^2)^{1/2}$ ;  (b)  $R' = [(L - x_1)^2 + y^2 + z_q^2]^{1/2}$    (C.1-4)

Here R and R' are respectively the distances from the projection of
the surface point S onto the plane $\zeta = 0$ to the points P and Q
(see Figure C.1-1).


The  Definitions  of  R  and  R' 
Figure  C.l-1 


We next expand R and R' in powers of the displacements x and y
from the specular point. First note that if $\psi_i$ is the grazing angle
of incident radiation at the specular point then

    $\cot(\psi_i) = \frac{L}{z_p + z_q}$    (C.1-5)

Thus, (C.1-4a) becomes

    $R = [(x + z_p \cot(\psi_i))^2 + y^2 + z_p^2]^{1/2}$

      $= [z_p^2 (1 + \cot^2(\psi_i))]^{1/2} \cdot \left[1 + \frac{x^2 + 2 z_p x \cot(\psi_i) + y^2}{2 z_p^2 [1 + \cot^2(\psi_i)]} - \frac{(2 z_p x \cot(\psi_i))^2}{8 z_p^4 [1 + \cot^2(\psi_i)]^2} + \ldots\right]$    (C.1-6)
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Next,  defining  rQ  to  be  the  distance  from  P  to  the  specular  point 
we have

    $r_o = z_p [1 + \cot^2(\psi_i)]^{1/2} = z_p \csc(\psi_i)$    (C.1-7)

Substituting this into Equation (C.1-6)

    $R = r_o + x \cos(\psi_i) + \frac{y^2 + x^2 \sin^2(\psi_i)}{2 r_o} + O\left[\frac{(x^2 + y^2)\, z_p\, x}{r_o^4}\right]$    (C.1-8)

In a similar manner from Equation (C.1-4b)

    $R' = [(x - z_q \cot(\psi_i))^2 + y^2 + z_q^2]^{1/2}$

       $= [z_q^2 (1 + \cot^2(\psi_i))]^{1/2} \cdot \left[1 + \frac{x^2 - 2 z_q x \cot(\psi_i) + y^2}{2 z_q^2 [1 + \cot^2(\psi_i)]} - \frac{(2 z_q x \cot(\psi_i))^2}{8 z_q^4 [1 + \cot^2(\psi_i)]^2} + \ldots\right]$    (C.1-9)

Finally, defining the distance from Q to the specular point to be $r'_o$
we have

    $r'_o = z_q \csc(\psi_i)$    (C.1-10)

and

    $R' = r'_o - x \cos(\psi_i) + \frac{y^2 + x^2 \sin^2(\psi_i)}{2 r'_o} + O\left[\frac{(x^2 + y^2)\, z_q\, x}{(r'_o)^4}\right]$    (C.1-11)
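
The specular point expansions of R and R' can be tested against the exact distances for a sample geometry. The Python sketch below is illustrative only; the depths, separation and displacements are assumed values, and the function names are hypothetical.

```python
import math

def specular_geometry(zp, zq, L):
    """Grazing angle from cot(psi) = L / (z_p + z_q), and the specular
    distances r_o = z_p csc(psi) and r_o' = z_q csc(psi)."""
    psi = math.atan2(zp + zq, L)
    return psi, zp / math.sin(psi), zq / math.sin(psi)

def R_exact(x, y, zp, psi):
    return math.sqrt((x + zp / math.tan(psi)) ** 2 + y ** 2 + zp ** 2)

def Rp_exact(x, y, zq, psi):
    return math.sqrt((x - zq / math.tan(psi)) ** 2 + y ** 2 + zq ** 2)

def R_series(x, y, r0, psi, sign=+1):
    """Second order expansion about the specular point; sign=+1 gives R
    (relative to r_o), sign=-1 gives R' (relative to r_o')."""
    return r0 + sign * x * math.cos(psi) + (y ** 2 + (x * math.sin(psi)) ** 2) / (2.0 * r0)

# Assumed sample values (arbitrary length units).
zp, zq, L = 100.0, 80.0, 300.0
x, y = 0.5, 0.3
psi, r0, r0p = specular_geometry(zp, zq, L)
err_R = abs(R_exact(x, y, zp, psi) - R_series(x, y, r0, psi, +1))
err_Rp = abs(Rp_exact(x, y, zq, psi) - R_series(x, y, r0p, psi, -1))
print(err_R, err_Rp)  # both are small compared with the retained second order term
```

Note that the first order terms in x cancel when the two series are added, which is why the path length sum is stationary at the specular point.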


Next, examining the reciprocal of r we find
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p 


11  z_£ 

7  '  R  11  +^2"+  •  •  •  1 


o  r 

o 


2  x  Cos  (if).) 

J—  •  ^ 


X2  Sin2(i|li) 
2  r2 


+  .  .  .  ] 


fai 

2 

r 

o 


(1  +  — 5—  +  .  . 


(C.l-12) 


r[1-^  + 

0  r 

o 


^2  x  Coa(»(»1) 


x2  Sin2(if).)  z  c 

. . . r  .1  4.  -  P— .  4. 

2  2  * 

2  rZ  r 

0  o 


.  J 


Similarly, 


1  1 


77  u- ^ 

o  r ' 


2  2 

x  Cos  (if)  )  x  Sin  (if;.)  25 

- — - - - - +  “^2  +  •  .  .] 

ro  2  r r,Z 


(C.l-13) 


Multiplying  these  two  reciprocals  together  we  obtain 


7T  -  ~r  [1  +  X  Cos^)  (i-  -  ir) 

00  00 

"  (^2  +  "^y)  [y2  +  x2(C082(\p  )  +  Sin2(if)  ))  - 

v*  v  ^  * 


(C.l-14) 


r  r 
o  0 


zpC)  +  •  •  ] 


where terms up to second order in x and y and up to first order in
$\zeta$ have been retained.

The sum of r and r' can be found to second order in a similar
manner:


[y2  +  x2  Sin2(\f>.) ] 

r  +r*  ■  r  +  r*  + - - - - — 

00  R 


2,  1 


-  2  C  SinCify) 


(C.l-15) 


+  r  i  u-<yr0n  2t+[i-  Vro>2)  i^1  +  ••• 

o  n  o 
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The  second  order  terra  in  ;  in  Equation  (C.  1-15)  will  not  be  of  importance 
in forming $jk(r+r')$ in the exponent of Equation (1.4-10) provided

    $2\pi \cos^2(\psi_i)\, \frac{h^2}{\lambda R_e} \ll 1$    (C.1-16)

where $h = \mathrm{Max}(|\zeta(x,y,t)|)$ and $\lambda = 2\pi/k$.

Next, we evaluate the various partial derivatives of r+r'


3x 


{ r+r '  ] 


2  sin2(*  )  <|->  -  (r-  +  rr)  smV) 

e  ro  ro 


-  h  a  (x,y)  (c°) 

D  O 


(C.  1-17) 


17  lr+r'! 


2  <r>  ■  ^  + 

e  oo 
-  b  (x,y) 


:  [b(x,y)  +  b*(x,y))  (-1) 


(C. 1-18) 


— •[r+r')  -  2  Sin(^±)  -  c°  (C.l-19) 

where  a(x,y),  b(x,y),  c(x,y)  are  the  x,y,z  direction  cosines  for  r  , 
and  a'(x,y),  b'(x,y),  c'(x,y)  are  x,y,z  direction  cosines  for  r'  . 

The  subscript  "s"  on  a,b,  or  c  denotes  the  sum  of  primed  and  unprimed 
quantities  while  the  superscript  "o"  denotes  the  evaluation  of  these 
quantities  at  the  specular  point.  Here  we  have  used  the  following: 


x  -  x 
_£ _ 


r 

o 


a(x,y),  ^  =  a0  ; 

o 


J  5  ”  b(x,y)  , 
o 

Using  Equations  (C.l-14), 


a’U,y) , 


=  -  b'(x,y) 


(C.  1-20) 


(1.4-7)  and  (C.l-17)  through  (C.l-19)  we 
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obtain  the  following 


Tr1"  fn^r+r’l  5  7"V,[1  +  *a°  “  a’°  "  a(x»y)  +  *'<*»y)l  a°  +  •••] • 

o  o 

l+'t  as(x,y)  (c°)  <f£)  +  bs(x,y)  (jj)  +  c°]  = 

(C. 1-21) 

rV  ^  a»(x->,)  (c8>2  (§>  +  bs(*-y>  O  +  % 

o  o 

+  [a°  -  a'°  -  a(x,y)  +  af(x,y)J  a0  c°) 

o 

The  result  may  be  used  to  obtain  Equation  (1.4-11). 

It is of interest here to investigate the magnitude of the terms
x/r, x/r', y/r and y/r'. In the vicinity of the edge of the active
scattering area which is roughly the first Fresnel zone (see Figure
C.1-2) these terms take on their maximum values. If the points P and
Q are located at the same depth then $r \cong r' \cong R_d/\cos(\psi_i)$ where $R_d$ is the
direct or line-of-sight distance between P and Q. For moderate
grazing angles the value of x at the semi-major axis edge of the first
Fresnel zone is approximately given by $\sqrt{\lambda R_d}/\sin(\psi_i)$. The terms of
interest are then of the order of magnitude $\sqrt{\lambda/R_d}\,\cot(\psi_i)$ and are
negligible if Equation (1.4-9) holds. In a similar manner the correction

terms  in  Equations  (C.l-8)  and  (C.l-11)  are  of  order  [/x/R,]^  z 

dp 

4  3 

Cos  ( / [ A  Sin  (^  )  j  and  can  be  ignored  to  the  same  extent. 
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The  First  Fresnel  Zone 


Figure  C.l-2 


APPENDIX  D 


REFINEMENTS  IN  THE  CRITERIA  FOR  DETECTABILITY 


D.l  Peak  Displacement  Corrections  for  the  Combined 
Correlator  Detector-Tracker 

In section 3.8 we examined the problem of detecting a target when
the replication delay parameter $\tau$ is "steered" to the (known) "on
target" condition ($\tau = \tau_0$). In most practical applications, however,
the parameter $\tau_0$ is not known a priori. For this reason many standard
detectors employ a kind of composite hypothesis test or detector-tracker
scheme which emulates the mode of operation of optimum maximum likelihood
processors^69. In this implementation we envisage a large number of
parallel correlator detectors, each with a different, fixed value of $\tau$,
covering the range of possible values for $\tau_0$. The processor outputs
are scanned simultaneously over $\tau$ for a peak of correlation.
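The bank of parallel correlators described above can be illustrated with a minimal discrete-time sketch. It is an illustration only, not the processor analysed in this report: the sampled signals, the time-average estimator, and the function names `correlate_at_lag` and `scan_for_peak` are assumptions made for the example.

```python
import random


def correlate_at_lag(x, y, lag):
    """Average product of x[n] and y[n + lag] over the overlapping samples."""
    n0 = max(0, -lag)
    n1 = min(len(x), len(y) - lag)
    if n1 <= n0:
        return 0.0
    return sum(x[n] * y[n + lag] for n in range(n0, n1)) / (n1 - n0)


def scan_for_peak(x, y, lags):
    """Run one correlator per candidate lag and return (best_lag, outputs)."""
    outputs = [correlate_at_lag(x, y, lag) for lag in lags]
    best = max(range(len(lags)), key=lambda i: outputs[i])
    return lags[best], outputs


if __name__ == "__main__":
    rng = random.Random(0)
    s = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    true_lag = 7
    y = [0.0] * true_lag + s[:-true_lag]  # y is s delayed by true_lag samples
    lag_hat, _ = scan_for_peak(s, y, list(range(-20, 21)))
    print("estimated lag:", lag_hat)
```

Each entry of `lags` plays the role of one fixed steering value, and the scan simply selects the largest output.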

As noted in section 3.6 the peak of correlation is generally not
located at the correct value $\tau_0$. This is true even in the absence of


scattering,  but  the  presence  of  a  certain  amount  of  delay  modulation  in 
most  scattering  models  aggravates  this  problem.  While  the  effect  of 
scatter  delay  modulation  on  tracking  errors  may  be  great,  the  impact  on 
signal  detectability  need  not  be  as  significant.  Hence,  the  "on  target" 
normalized  variances 

A(T  ,t  .T,„)|H, 


d*<T  .I.iO  -  °-  - i  dt(T.,T,p)  -  g  -  (D.l-l) 


o'  o'  '"  ■  o  ,2 


[E(to,T,p)]' 


1'  o’ 


A(WT,W)  IH1 

- 


[5(tq ,T,p)  ] 


are usually overly pessimistic indicators of detector performance.


It is desirable to compensate for errors that arise in the evaluation
of the correlator detector due to variation in the location of the peak.

In this section we examine a set of correction terms that approximately
perform this compensation and that are valid in the same limit as the
results of section 3.6. That is, we assume that the processing interval
$T$ is large enough so that the maximum deviation between the location of
the peak, $\tau_p$, in any realization $\Xi(\tau,T,p)$ and the mean location
$\tau_0$ is much less than the signal correlation width. We also assume
sufficient signal bandwidth limitation so that the second derivative of
$\Xi$ with respect to $\tau$ exists. Furthermore, we assume that $\Xi''$ is
essentially fixed in the vicinity of the location of the peak at its mean
value, i.e. $\Xi''(\tau,T,p) \approx \bar{\Xi}''(\tau_0,T,p)$. This in turn
implies that the null in $\Xi'(\tau,T,p)$ at $\tau_p$ is well separated from
any neighboring zeros.

The correction technique consists essentially in fitting a parabola
to the correlation function estimate $\Xi(\tau,T,p)$ at $\tau = \tau_0$. The
location of the maximum for this parabola is an approximant to the location
of the peak $\tau_p$ (see Figure D.1-1).


Figure D.1-1. The Quadratic Approximant to $\Xi$
The equation for the quadratic approximant is

$$f(\tau) = a(\tau-\tau_0)^2 + b(\tau-\tau_0) + c \tag{D.1-2}$$

so that

$$f'(\tau) = 2a(\tau-\tau_0) + b \tag{D.1-3}$$

$$f''(\tau) = 2a \tag{D.1-4}$$

Fitting this to $\Xi(\tau,T,p)$ at $\tau = \tau_0$:

$$a = \tfrac{1}{2}\,\Xi''(\tau_0,T,p) \tag{D.1-5}$$

$$b = \Xi'(\tau_0,T,p) \tag{D.1-6}$$

$$c = \Xi(\tau_0,T,p) \tag{D.1-7}$$

The approximant has its peak at the location

$$\hat{\tau}_p = \tau_0 - \frac{\Xi'(\tau_0,T,p)}{\Xi''(\tau_0,T,p)} \tag{D.1-8}$$

To the extent that $\Xi''(\tau_0,T,p)$ is approximately equal to its mean
value this result is equivalent to the discussion of section 3.6.
The value of $f(\tau)$ at the peak is

$$f(\hat{\tau}_p) = -\frac{[\Xi'(\tau_0,T,p)]^2}{2\,\Xi''(\tau_0,T,p)} + \Xi(\tau_0,T,p) \approx -\frac{[\Xi'(\tau_0,T,p)]^2}{2\,\bar{\Xi}''(\tau_0,T,p)} + \Xi(\tau_0,T,p) \tag{D.1-9}$$
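The parabolic correction of (D.1-8) and (D.1-9) can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the correlator output is taken to be available as a smooth function of the steering delay, and the helper `central_differences` (a hypothetical name, not part of this report) estimates the derivatives from three samples.

```python
def parabolic_peak(xi0, d1, d2, tau0):
    """Peak location and height of the parabola fitted at tau0.

    xi0, d1 and d2 are the value, first derivative and second derivative
    of the correlator output at tau0 (the coefficients c, b and 2a).
    """
    if d2 == 0.0:
        raise ValueError("second derivative must be nonzero")
    tau_peak = tau0 - d1 / d2              # location, as in (D.1-8)
    height = xi0 - d1 * d1 / (2.0 * d2)    # value at the peak, as in (D.1-9)
    return tau_peak, height


def central_differences(f, tau0, h=1e-3):
    """Hypothetical helper: estimate f, f' and f'' at tau0 from three samples."""
    fm, f0, fp = f(tau0 - h), f(tau0), f(tau0 + h)
    return f0, (fp - fm) / (2.0 * h), (fp - 2.0 * f0 + fm) / (h * h)


if __name__ == "__main__":
    # Example output with its true peak at tau = 0.4 and height 3.0.
    output = lambda t: 3.0 - 2.0 * (t - 0.4) ** 2
    xi0, d1, d2 = central_differences(output, 0.1)
    print(parabolic_peak(xi0, d1, d2, 0.1))
```

For an exactly quadratic output the fit recovers the true peak; for a general correlation function it is only as good as the assumption that the second derivative is nearly constant near the peak.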


It should be noted that in order for this analysis to be meaningful
the curvature of $\Xi(\tau,T,p)$ should be toward the $\tau$-axis at $\tau_0$.
Hence, from the following approximate result for the mean height of
the peak

$$\overline{f(\hat{\tau}_p)} \approx \bar{\Xi}(\tau_0,T,p) - \frac{\overline{[\Xi'(\tau_0,T,p)]^2}}{2\,\bar{\Xi}''(\tau_0,T,p)} \tag{D.1-10}$$

we have approximately

$$\left|\overline{f(\hat{\tau}_p)}\right| \ge \left|\bar{\Xi}(\tau_0,T,p)\right| \tag{D.1-11}$$

For this analysis to be accurate, however, the correction term should be
small compared with $\bar{\Xi}(\tau_0,T,p)$.

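Similarly, the correction to the variance becomes

$$\mathrm{Var}[f(\hat{\tau}_p)] \approx \overline{\left(\Xi - \frac{[\Xi']^2}{2\,\bar{\Xi}''}\right)^2} - \left(\bar{\Xi} - \frac{\overline{[\Xi']^2}}{2\,\bar{\Xi}''}\right)^2 = \mathrm{Var}[\Xi] + \frac{\mathrm{Var}\{[\Xi']^2\}}{4\,[\bar{\Xi}'']^2} - \frac{\overline{[\Xi - \bar{\Xi}]\,[\Xi']^2}}{\bar{\Xi}''} \tag{D.1-12}$$

where every quantity is evaluated at $(\tau_0,T,p)$.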


The  assumption  that  the  correction  to  the  mean  in  Equation  (D.l-10)  is 
small  does  not  guarantee  that  the  corrections  to  the  variance  are  small. 
Indeed,  when  slowly  varying  delay  modulation  dominates  any  amplitude 
modulation  present  in  the  scattering  model,  these  correction  terms 
become  very  significant. 
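The way the variance of the corrected peak height splits into separate contributions can be checked on sample data. The sketch below assumes, as in the text, that the second derivative is held fixed at its mean value, so that the peak height is the output minus the squared first derivative divided by twice that mean. The function and variable names are illustrative assumptions, and the samples would in practice come from simulation or measurement.

```python
def mean(values):
    """Arithmetic mean of a sequence."""
    return sum(values) / len(values)


def cov(u, v):
    """Population covariance of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


def corrected_variance_terms(xi, dxi, d2_mean):
    """Split Var[xi - dxi**2 / (2 * d2_mean)] into three contributions.

    xi      : samples of the correlator output at tau0
    dxi     : samples of its first derivative at tau0
    d2_mean : mean second derivative, treated as a fixed number
    """
    q = [d * d for d in dxi]
    direct = cov(xi, xi)                        # variance of the output itself
    quartic = cov(q, q) / (4.0 * d2_mean ** 2)  # variance of the squared slope
    cross = -cov(xi, q) / d2_mean               # output/squared-slope coupling
    return direct, quartic, cross
```

The three returned numbers add up to the sample variance of the corrected peak height; the second is the term that leads to the highest order moments, and the third is the coupling term that requires the mixed moment of the output and its squared derivative.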

A closer examination of Equation (D.1-12) reveals that in order to
relate all of the correction terms for the variance of the detector
output to the spectra of the input signals and appropriate system
functions one must evaluate many integrals. The primary origin of this
complexity is the requirement for the evaluation of the eighth order
moment in $\mathrm{Var}\{[\Xi'(\tau_0,T,p)]^2\}$. Using the same technique
applied in connection with the derivation of Equation (2.1-18) one can show
that $(2p)$-th order moments of the increments $dZ_x(\omega)$ give rise under
the Gaussian hypothesis to

$$\frac{(2p)!}{2^p\,p!} \tag{D.1-13}$$

distinct integrals. The presence of additive noise compounds this
difficulty. For the case of $p = 4$ there are a minimum of 105 different
integrals!
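The count in (D.1-13) is the number of ways of grouping $2p$ Gaussian factors into $p$ pairs, and it can be evaluated directly. The short sketch below (the function name is an assumption for illustration) reproduces the figures quoted in the text.

```python
from math import factorial


def gaussian_pairings(p):
    """Number of ways to split 2p items into p unordered pairs."""
    if p < 0:
        raise ValueError("p must be non-negative")
    return factorial(2 * p) // (2 ** p * factorial(p))


if __name__ == "__main__":
    for p in (1, 2, 3, 4):
        print(p, gaussian_pairings(p))
```

For `p = 3` and `p = 4` this gives 15 and 105 pairings respectively.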

However, for $p = 3$ there are only 15 integrals for the signal-only
terms. Thus, evaluation of the last correction term in Equation
(D.1-12) is at least partially tractable. When the scattering fluctuations
produce much more amplitude modulation than delay modulation,
Equation (D.1-12) may be reduced approximately to


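$$\mathrm{Var}[f(\hat{\tau}_p)] \approx \mathrm{Var}[\Xi(\tau_0,T,p)] - \frac{\overline{\Xi(\tau_0,T,p)\,[\Xi'(\tau_0,T,p)]^2}}{\bar{\Xi}''(\tau_0,T,p)} + \frac{\bar{\Xi}(\tau_0,T,p)\;\overline{[\Xi'(\tau_0,T,p)]^2}}{\bar{\Xi}''(\tau_0,T,p)} \tag{D.1-14}$$

The quantities $\bar{\Xi}(\tau_0,T,p)$, $\bar{\Xi}''(\tau_0,T,p)$ and
$\overline{[\Xi'(\tau_0,T,p)]^2}$ can be evaluated easily from integrals
obtained elsewhere in this report, but the third order moment
$\overline{\Xi(\tau_0,T,p)\,[\Xi'(\tau_0,T,p)]^2}$ requires a separate
development.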
APPENDIX E


VARIANCE INTEGRALS FOR
RANDOM AMPLITUDE AND DELAY MODEL

E.1 Multipath Processor


The integrals in (3.3-9) through (3.3-13) are evaluated here
for the spectra (4.3-2), the filters (4.2-3), and the propagation
model described by (4.1-11), (4.1-12), and (4.2-4):


Il<v)  .  I  'ft*  kl2  Rnl 
w  a  nl 


exp{-u,  l_n)dw 

2  U  2* 


2 

-  kf  P  _ 
1  nl 


(E.l-1) 


n 


lnl 


2.2 


-J  exp(-l/2vzft‘nl) 


In 


where 


(E. 1-2) 


I2(V)- 


eaa(v> 


Rtt(v)) 


dco 


RAA<v>  k 


2 

1  px 


X 


£ 


2  +  2tor  -  Ett(v>J 

xl 


(E. 1-3) 


RttCv)J 


i  "I 
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r 


where 


-L./i.  +1_ 
fti*  7  fil  V 


(E.l-4) 


Similarly, 
[3U'  “  “2|Kn2 


i,(v)-k2rP. 


n 


n2  ! 


exp{-lv2n2n2) 


+  A2  P  /  J}l*  I  exp{-lv2ft2  ) 

(  n  /  2  ix  J 


with 


(E.l-5) 


n 


2n2 


■/ 


4-  •»  "T 

°2  <4 


(E.l-6) 


Also,  applying  (A. 1-12)  and  (3.3-12) 


I4(t,t',v)  - 


U)  (O' 


>(T-T8+Td)-V(T'-Ts+Td)  2  k2  2  2 

1  2  d  x 

.  B* 


ru 


m 


TT 


k?  k2  A?  P2 


Raa(v)  exp  <  -1  [  (w2+(d,2)o 2  -  20X0*  R..(v)ll  du>  du* 

Jj  2tt  2n 

(E. 1-7) 
2 


R*V7‘  R2t(v) 

m 


AA' 


‘4  2  ,  J 

-1  . 

a  -  R  (v) 
m  tt 

4 

1 

[(t-t  +t  ) 
s  d 

► 


where 


+  (T''T8+Td)2l“o  ‘  2(T-T.+T^)(T'-^.+^)  R„(V)1 


8  d' 


»  »  «/  ax  ' 

8  d  TT 


2  2  1 

a  ■  o  +  — r—  +  — i—  + — 

m  t  2  2  2 

n2  2n2  2«2 


(E.  1-8) 
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Finally,  using  (4.2-5)  and  defining  ft  -  l/0 

m  is 


kl  k2  kl  *2  P2 

2  d  c  *x 


°B  **P{_iPf  [(T-T.+T  ): 
®  2^  o  8  u 


(E.l-9) 


Thus,  we  have 


nCt.x'.v)  -  k! 


+  (T'-W >  > 
1  2_2 
“2^  0  lnl 


e 


(E.l-10) 


Ad  raa<v> 


ft2  / 

*  /“«  -  Rtt 


(V) 


+  (t,_t0)21°C;  -  2<T-0  (T'-t.)R„(v) 


o'  '  m 


0  TT 


+  exp 


{-|j  am  -  [t(^’-To)2 


+  (t-v-to)2  J  a2  -  2(v+t'-to)(t-v-to)Rtt(v) 


]} 


where

$$\tau_0 = \tau_s - \tau_d \tag{E.1-11}$$

is the multipath replication delay or steering parameter.
E.2 Array Processing


In this section we evaluate the integrals in (3.5-12) through
(3.5-19) for the noise cross-spectrum (4.4-1), the filters and signal
spectra of section 4.2 and the fourth order and second order statistics
of sections 4.1 and 4.3. Following the derivation of (E.1-1) and (E.1-3),


P  _Jna_  2  fna 
a  '  f  na  \  _  / 


K  (v) 
a 


2wRAa(v)k;  Px 


(E.2-1) 


n 


na 


V 


K  *  2K.  -  Rta(v) 1 

Rfx 


(E.2-2) 


exp 


+  2[o2a-  *Ta(v)]l  l 


-1 

]  J 


Mc<v) 


2.  k?  P 
f  nc 


-  —  2  2 
2v(,fnc 

- Je 


(E.2-3) 


0 


nc 


KC(V) 


2tt  Ra  (v)  kg  P 
Ac  f  x 


ft  nr 
x  Jr 

^  nfx 


+  2[o  -  R  (v)) 

TC  TC 


(E.2-4) 


+  2(o 


TC 


R  (V)] 

TC 


where 


b) 


(E.2-5) 
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and 


(E.2-6) 


—  2  2 
exp{-2[  +2a22(v,T1T,)ww,+a23(v»T»T,)t‘,,  D  dai  da)'  * 
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-2  [  a^1(vtT,T,)(T-To)Z-022(v,TtT,)(T-T0)(T,-To) 


_*  exp 


la'  (v,t,t')o*  (v.t.t')  -a2  (v,t,t’)] 


^3(V.t.T')(t'-To)21  _R[*](T,V(VfT.) 


^  a^Cv.T^'Ja'  (v,t,t')  -a?9(v,t,T') 


and  finally, 


J3(t 


,t’,v)  .  f  [  e  JU>(v+t')+u'(v-t)]  P2  x  e 


(E.2-10) 


exp{-2  [a^CvjTjT^w^a^CvjT^Mww'+a^lv^jT^dj'2])  du>  du'  - 


P2 

x  exp 

fi2 

x 


-2  [a '  ( v , T , T ' ) (v+r  •  -t  )  “2a. * (v , t , t ' ) (v+t  * -t  ) (v-t-t  ) 
31  O  32  0  0 

+a33(v,T,T,)(v-T-To)2] 
K,(v,T,T')a’  (v,t,t')  -a2  (v,t,t')] 


2(t,v,v+t') 


a31(v,T,T')a^3(v,T,T')  -a32(v,T»T') 


The replication time in this case is

$$\tau_0 = \tau_{s1} - \tau_{s2} \tag{E.2-11}$$


Using  these  formulas  we  have 

c  2  2 

n(x,T',v)  «*  P2(  (Lnajl  flfnaj  exp{_lJv2+(v-T+T ')2]fifna^ 

L  p2  n2  2 

X  f 

R.  (v-t+t') (Pna/Px)fifna 
+  Aa _ x 

n„a  nx/X^0?a-Ht.(v-t+T’)1 

fx 


(E.2-12) 
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+ 


RAc(v+t')  (pnc/Px)ftfnc 


x 


exp{-l  (v-T  +T')2f  1_  2[a 

0  0  A  + 


(» 


2 

fx 


2 

r 

ta 


-1 


RIC(V+T 


■>]]  -l(-v)24c} 


RAr(T“v) (Pnc/Px)Qfnc 

+  — — . . .  . .  x 

°nc  Rtc(t-v)] 

flfx 


-1 


exp{-l_(T-T  -v) 
2  0 


2  [o 


n 


fx 


xa 


Rtc(t-v) ) 


j 


-x(v+T')2n2  ) 

2  £nc 


+  -T  R^(t,V,V+t')  X  [ 
"x  1>2 
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-l[a;i(v,T,t’)v2-2ai2(v,T,T»)v(T-T«-v) 
exp  +a*3(v,T,T,)(T-T,-v)2  ] 

[a^1(v,T,T,)a|3(v,T,Tr)  -  a22(v,T,T')] 

/  a’^v.T.Oa^v.T.T*)  -  a22(v,T,T,> 


“It°2i^v»T»T,HT"To^  “2a22^v,T,T,^T‘’To^T  “To^ 

+a23^V*T,T'^  ^T,”To^2^ 

2 

^°21<v,T*T,)a23<v*T’T,>  “  a22^v,T*T'^ 


/  a^CVjTji'Ja’  (v,t,t*)  -  a22(v,T,T*) 


-j:(a31(v,T,T,)(v+T,-To)  -2a32(v,T,T,)(v+T,-To)(V“T“To) 

+a^3(v,T,T,)(v-T-T£))2  ] 

2 

(a'  (v,T,T’)a*  (v,t,t‘)  -a„(v,T,T') ] 


APPENDIX G


TABULATION OF THE FIRST FEW $G_{n,l,m}$


The following is a tabulation of the first few coefficients
$G_{n,l,m}(p,q,\tau)$ of the polynomials defined by equation (5.2-19).
All unlisted coefficients for a given polynomial are zero.

po,o,o^",u',T^ 

gc,o.c*0,0,t*  "  1 

(G.l-1) 

go,i,o(0»°*t>  ■  1 

°h  (1  -  <=h(t)) 

(G. 1-2) 

go,i,o(1»0,t)  "  1 

(G.  1-3) 

**0,0, l^**0' ,T^ 

G0,0,1^G,G,T^  c  G0  ^  q(0,0,t) 

(G.l-4) 

G0,0,l(0pl»T>  "  G0,1,0^1,0,T^ 

(G. 1-5) 

pi,o,o(“-“'iT) 

p.(t) 

G1  n  *  h 

*  *  2  2 

ffh  (1  -  ph(t)> 

(G. 1-6) 
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P0.1.1(|I,’U'»T) 


C0.l.l<1*1*,>  *  1 


(C.l-7) 


Gq  .  .(1,0,  )  •  G.  ,  .(0,1,  )  -  - - r 

0,1,1  0,1,1  o‘  (1  -  p£(t» 


°0,l,l^0,0,t^ 


P2.0.0(m’“'’t) 


(1  -  P*(t»«J 


G2t0t0*0,0’  ^  "  gq 


G2,0fp(1,°’  )  "  C2,0,0^C*1,  >  “  G0,1,0(0,0,T) 

G2,0i0^*^,t^  m  * 


pi,0 


G1,1,0^^,T^  "  ^  G1,0,0^G,G,T^ 

G1  1  0(°’°’T>  "  G1  0  0<0,0,t)G0  1  o*°'0,T* 

Gl,0,  ^  G1,0,0^,G,T^ 


(G.l-8) 

(G.l-9) 


(G.l-10) 


(G .1-11) 


(G.l-12) 


(G.l-13) 

(G.1-1A) 


(G.l-15) 


B-210^ 


1 


C1.0.1<°»°-»>  ■  «1,1§0<0.0,T>  (C.l-16) 

^0,2 ,  Q  1 »  t) 


C0,2,0^2*®»t)  "  * 

(G.l-17) 

G0,2  "  2Gn  i  n(0»C,r) 

(G. 1-18) 

G0,2,0(0,0»T>  "  Go,l,o(0»°»T)2 

(G.l-19) 

P0,0,2^a),(ti>,T^ 


(G. 1-20) 

C0,0,2(0*1't>  "  Go,2,0(1,0>t) 

(G.l-21) 

G0,0,2(0,0>t)  *  G0,2,0(0,0,t) 

(G.l-22) 
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APPENDIX  H 


VARIANCE INTEGRALS FOR

THE RANDOM SINUSOIDAL BOUNDARY MODEL


H.1 Multipath Processor

In  this  section  we  evaluate  (3.3-9)  through  (3.3-13)  for  the 
spectra  (5.3-1),  the  filters  (5.3-2)  and  the  random  sinusoidal  boundary 
discussed  in  sections  (5.1)  through  (5.2): 


IjCvO 


nl  fn2'  nl 


/ T 


T~T 

u  ft 


7 

fnl 


(H.  1-1) 


I3(u) 


Pn2flfn2/nn2 

/TT" 7  “  2 

0  fifn2 


1  «  x  ftfx/ftx 

Rd  7rTrr 

u0x 


(H.  1-2) 


Using  (5.3-8): 

P 


•  c 


I  fol  =  x  x  2  E  2  QV  (n'n'u)  en  Cos (nft  u) 

*(  1  'vr;>  Jo  tsit2  xx  n  5  ! 


eo  oo 

E  E 

*0  m=0 


1  ^ A(l,2n+1)  A(m,2n+1)  "  r(l+n-Hn+r+*s)  1 

t4(n+l)3  1!  m!  rag  T  ( l+m+n+%)  r! 


n+l+m 

E 

p,q«0 


<(n*l)  l-p^(o)  J1+n+"  Gn>1(B(p,q,o)  p2(o)  r 


(H. 1-3) 


RE{ 


1 


2tt  T  2 (n+l+m+r)+l 


c\l  pi**-™2 


0,Jj,n+l+m-p-q-^ 


The corresponding expression for $I_4(\tau,\tau',\nu)$ is much more difficult
to obtain in closed form because of the inverse first power of frequency
in the exponent of (5.2-29). As in (5.1-6) this is handled by
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convolution  with  the  function  f^Cu)  defined  by  (5.1-8).  Unfortunately, 
the  convolution  must  either  be  executed  numerically  or  taken  in  the 
limit  of  large  distances  from  the  boundary.  In  the  latter  case  the 
convolution  can  be  ignored  as  discussed  in  section  (5.1).  In  the  former 
case  we  write: 


I4(?  *,u) 


Px/  °x 

RjCr+r1)2 


(n,n,u)  en  Cos(nnsu) 


r  “i  .  1  um  W-2"41)  Mm.2n+i)  j  l  i_  nt 
l,m-0  1!  m!  jio  r d+n+m+Js)  r!  p> 


n+l+m 
Z 

qc0 


(H. 1-3) 


((„*!) [l-ph2(u)]>1^n  G  j  (p,q,u)  [p2M]  r  — r-L - - 

*  (2r )  T(n+l+m+r+^) 

i  5  Jj+ljn-r-p-l  _ 

RE  {G^  (**cn  [“fx- J  Criy-t  Q)]  O.^n+in-p  )  ,  *»(,.) 


RE  { 


*3+**n-r-q-m  f^p')  dp  dp' 

0,*s,n+l-q  }i 


Expressions  (H.l-1)  through  (H. 1-3)  can  easily  be  used  in  conjunction 
with  (3.3-8)  to  compute  the  correlator  output  variance.  To  compute 
the  tracking  error  we  need  the  following  derivatives: 


f  P 


■P?.  q3 
nn2  fn2 


32  I^u-t+t') 

F&P“  3 

(H. 1-4) 

[i  =  2Cnfn2(o-T*T')i2J 

[1  ♦  tnfn2(o-T+T')]2J  5/2 

p  nl  f 
♦  .  x  fx  L 

1  -  2[nfx(o-T*T')] 2 

5/2 

0  r 
x  [ 

1  "  [°fX(u-T+T')32 
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and 


3t3t 


iL  VT'T,'U> 

XtT  h 


p2/q2 

X  X 


1 - r- T  L 

Rd(rotro)  n'° 


J*  (n,n,u)  cn  Cos (nd^u) 

nr  1 


00  00 
Z  Z 
1=0  m=0 


1  , A(l,2n+1)  Atm,2n+1)  7  1  j  ntl+m 

[4(n+l)]  1  m  i!  n!  1  ra+n+bVy  7T  1 

r=U  D.Q= 


p*q=o 
(H. 1-5) 


<(n*l)  [l-P*(o)l  )1+"*n  0  lim(p.q.u)  [p^uflr 


(2 *r  r(n+l+m+r+*s) 


RECJcn  G31  dCl  [afx-J  (T-‘"To)]  "  V^n-r-p-l-l,)  ^(dp) 

0,*s,n+m-p 


RE{+jcnG31  (,<cnCafx*j(t'-,,'to)]2  f2Cdl,') 


These  can  readily  be  substituted  into  (5.3-16)  to  obtain 
the  tracking  variance. 
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Appendix  I 

Numerical  Evaluation  of  the  P^CZ) 


I.1 Relation to Struve's and Weber's Functions
Following  Watson^8  (p.  331,  sec.  10.41,  #3) 


1 

2n  r(a) 


Gn  (h 


*  ) 


I  e‘Zu(l.u2Ja  du 


-  W(i-a)  rft)  (W*-1*  Js.#4%  m-r.a^ro] 


(1. 1-1) 


where $S_\nu(Z)$ is Struve's function

$$S_\nu(Z) = \sum_{m=0}^{\infty}\frac{(-1)^m\left(\tfrac{1}{2}Z\right)^{\nu+2m+1}}{\Gamma(m+3/2)\,\Gamma(\nu+m+3/2)} \tag{I.1-2}$$
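The power series for Struve's function can be summed directly for small and moderate arguments. The following is a minimal sketch, assuming a positive real argument and an order for which none of the gamma-function arguments falls on a pole; the function name and the fixed number of terms are choices made for the example.

```python
import math


def struve_series(nu, z, terms=40):
    """Sum the power series for Struve's function of order nu at z > 0.

    Assumes nu + m + 3/2 is never zero or a negative integer, since
    math.gamma raises ValueError at those poles.
    """
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * (0.5 * z) ** (nu + 2 * m + 1)
                  / (math.gamma(m + 1.5) * math.gamma(nu + m + 1.5)))
    return total


if __name__ == "__main__":
    z = 1.0
    closed_form = math.sqrt(2.0 / (math.pi * z)) * (1.0 - math.cos(z))
    print(struve_series(0.5, z), closed_form)
```

The order one-half case has a well-known closed form, which the example uses as a check on the series.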


and $Y_\nu(Z)$ is Weber's function defined in general by

$$Y_\nu(Z) = \frac{J_\nu(Z)\cos(\nu\pi) - J_{-\nu}(Z)}{\sin(\nu\pi)} \tag{I.1-3}$$
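For non-integer order the definition of Weber's function above can be evaluated from the standard power series of the Bessel function of the first kind. The sketch below assumes a positive real argument and a non-integer order; the function names and the term count are illustrative.

```python
import math


def bessel_j_series(nu, z, terms=40):
    """Power series for the Bessel function J of order nu at z > 0."""
    total = 0.0
    for k in range(terms):
        total += ((-0.25 * z * z) ** k
                  / (math.factorial(k) * math.gamma(nu + k + 1.0)))
    return (0.5 * z) ** nu * total


def weber_y(nu, z):
    """Weber's function for non-integer nu from J of orders nu and -nu."""
    s = math.sin(nu * math.pi)
    if abs(s) < 1e-12:
        raise ValueError("integer order needs the limiting form instead")
    c = math.cos(nu * math.pi)
    return (bessel_j_series(nu, z) * c - bessel_j_series(-nu, z)) / s


if __name__ == "__main__":
    z = 2.0
    print(weber_y(0.5, z), -math.sqrt(2.0 / (math.pi * z)) * math.cos(z))
```

Integer orders are rejected because the ratio becomes indeterminate there, which is why a separate limiting expression is needed in that case.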


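For integer values of $\nu$, (I.1-3) becomes

$$Y_n(Z) = -\frac{1}{\pi}\sum_{k=0}^{n-1}\frac{(n-k-1)!}{k!}\left(\tfrac{1}{2}Z\right)^{2k-n} + \frac{2}{\pi}\ln\left(\tfrac{1}{2}Z\right)J_n(Z) - \frac{1}{\pi}\left(\tfrac{1}{2}Z\right)^{n}\sum_{k=0}^{\infty}\frac{\left[\psi(k+1)+\psi(n+k+1)\right]\left(-\tfrac{1}{4}Z^2\right)^{k}}{k!\,(n+k)!} = (-1)^n\,Y_{-n}(Z) \tag{I.1-4}$$

where

$$J_\nu(Z) = \left(\tfrac{1}{2}Z\right)^{\nu}\sum_{k=0}^{\infty}\frac{\left(-\tfrac{1}{4}Z^2\right)^{k}}{k!\,\Gamma(\nu+k+1)} \tag{I.1-5}$$

and

$$\psi(Z) = \frac{\Gamma'(Z)}{\Gamma(Z)} \tag{I.1-6}$$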
is the digamma function. Again for integer values of $Z$ we have

$$\psi(n) = -\gamma + \sum_{k=1}^{n-1}\frac{1}{k}$$

where $\gamma = 0.57721566$.
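The integer-argument form of the digamma function is a finite harmonic sum and is straightforward to compute. A minimal sketch follows; the function name is an assumption, and Euler's constant is entered to double precision.

```python
EULER_GAMMA = 0.5772156649015329  # Euler's constant


def digamma_integer(n):
    """Digamma function at a positive integer n via the harmonic sum."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, digamma_integer(n))
```

For `n = 1` the sum is empty and the result is just the negative of Euler's constant.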

We can extend (I.1-1) by successive differentiation. Note the following:

$$\left(\frac{d}{dZ}\right)^{p} Z^{k} = \frac{\Gamma(k+1)}{\Gamma(k-p+1)}\,Z^{k-p}$$

The result is valid for $k > -1$ and is zero for $k-p = -1, -2, \ldots$
For  example: 


r  C-l)k  tw2ktl  r(2k+z) 
k.0  r(k+3/2)  r(k-r..3/2)  r(2k+2-p) 


(dl)P  [&*>"  V1*] 


"  (-l)k  C^2)2k+2nr(2k+2v+l) 

k^Q  kfr(n+k+l)  r(2k+2v-p+l) 


[mn  Y_n(2)] 


7f 


"I1  Cn-k-1) !  ftZ)2k  r 
2‘n  k!  r(2k-p+l) 


-j0  (ai)p'v  [*«»>]  (ar)2  Lftz)n 


P! 

v! (p-v) ! 


+  Hi 

k=0 


:<Hk+l)  +  iKn+k+1) >  (-*sZ)2k+2n  r(2k+2n+l) 
k!  (n+k)!  T(2k+2n-p+l)  ZP 
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Cl. 1-7) 

First  we 

(1.1-8) 

-3 

w  •  •  i 

1 

2P 

(1.1-9) 

_1 

ZP 

(I. 1-10) 

(2k+l)  1 

ZP 


(1.1-11) 


Using  (1.1-9)  through  (I. 1-11)  and  (5.3-9) 


OTTiT 


,31 

’13 


<1* 

nr 


**-*19  ) 

0,%,a-lip-V 


|  \iP  e"*u  (Hu^)"a  du  * 


(-l)pvr(-a)  rft)  (3J)P  {  (Wa+J*  [s.a.jjW  -  Y^d)]} 


(1.1-12) 

Provided  a»n*-*s  where  n  is  an  integer,  we  may  use  (i.1-9)  through  (1. 1-11) 
directly.  However,  if  a**n  (1.1-12)  is  indeterminate  because 


S-nV‘}  B  Y-n-^>  e 
(-Dn  J^C*) 

and  T(-n)  «  ±*. 


(1.1-13) 
I.2 Evaluation in the Special Case of Integer Values of $a$

The limit $a \to n$ in (I.1-12) must be approached with caution. Taking
(I.1-13) as a starting point we have from (I.1-2) using $a = n - \epsilon$


(n+lj)+e^ 


n-1  (  nk 

k*0  — 


♦  cwn+*e  (-1)" 


E 

k*0 


2k 

(k+l+e) 


and  from  (1.1-3) 


(1.2-1) 


Y  -  (n+^)+e^  = 


J-(n+*i)+e  (a)  (-l)n  SinQre)  -  Jn+%-e(Z) 
(-l)n+1  Cos(ne) 


,  1%  Sin  (ire)  ^^-(n+^+e 
(’1}  toi(vT)  (W 


(-nk  m2k 

r(k+l)  rt-n-ij+e+k+l) 


n 


+  HO 

Cos(ne) 


(W 


(n+*0-e 


k*0 


(-l)k  W)21< 

T(k+1)  r(n+Ve+k+l) 


(1.2-2) 


Now 

m* 


c£n(«) 

=  1  +  e  tnPsZ)  . .  • 


r(-n+e) 


(-I)”  * 

r(l+n-e)  Sin(we) 


(1.2-3) 


(1.2-4) 


B-218 


and 


T&EJ  5  TOT  -1  '  *(a)£} 


(1.2-5) 


where  a  t  0,-1, -2,...  Substituting  (1.2-3)  through  (1.2-5)  into 
(1.2-1)  and  (1.2-2)  we  have 


lim  V(-n+e)  rft)  ft*)nt,s*c 
e*0 


[S-(n+W.et«  -  y-(n.«.c(2>] 


n-1 

£ 

k-0 


(-i)k 

r  (k+3/2) 


CW2k-(n+5*)+1  r(-k4*n)  Sin(7re)(-l)k‘n+1 


4  Cl+*to«3  c-d"  jo  r^drfeiT  &•«  *»♦«) 


♦1$$  M-(ntWt£  “  w2k 


k‘0  r(k+i)  r(-n-%.k.i) 


-  (W"^  1-e  InftS)  (-!)"  ^  ^ ^  f,H  ♦(k*w*l)] 


-nk  (ux\2k 


n! 


(W 


^'n'"  2  »  <«2>k+,s 


4  *»<W  W>  4  t-1’"  (« 

-  (Wn4ls  k'!fr(iwU.i)  6(k.l).*(k+n+)j.l)] 


(1.2-6) 
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Finally, 


in  PfnV  G13  (^r 


e"*u 

0+uV 


du 


n-1 
-  I 
k*0 


r(-k-fn)  T(2k+2)  Oil2) 
ftw/jy  t (ik+aipj  2P — 


(1.2-7) 


+ 


Nw]  (ai)p  [(wn  j„w] 


p! 

vl  (p-v) • 


C-i)n»  r 


C-W2)k 

k=0  k !  T  (-n-^+k+1) 


T(2k+1) 

TliiSpi 


pTlTIP 


-  £ 
k«0 


(- 1) k  (W  2k+n+llr  (2k+n+3/ 2)  (k+1)  +i|/  (k+n+3/2)l 

k !  rTn% .  ■  ZP - 


In computing the digamma function for fractional arguments the following
is useful:

$$\psi\left(n+\tfrac{1}{2}\right) = -\gamma - 2\ln 2 + 2\sum_{k=1}^{n}\frac{1}{2k-1} \tag{I.2-8}$$
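The half-integer digamma values needed in the series above follow from the standard identity for the digamma function at an integer plus one half. The sketch below uses that identity; the function name is an assumption for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def digamma_half_integer(n):
    """Digamma function at n + 1/2 for an integer n >= 0."""
    if n < 0:
        raise ValueError("n must be non-negative")
    odd_sum = sum(1.0 / (2 * k - 1) for k in range(1, n + 1))
    return -EULER_GAMMA - 2.0 * math.log(2.0) + 2.0 * odd_sum


if __name__ == "__main__":
    for n in range(4):
        print(n + 0.5, digamma_half_integer(n))
```

Successive values differ by the reciprocal of the smaller argument, in agreement with the digamma recurrence, which gives a quick consistency check.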
I.3 The Asymptotic Expansion


For large $Z$ the series in the two preceding sections converge too
slowly to be useful. Fortunately, a simple asymptotic formula can be
used for all values of $a$. Equation (I.1-8) must be altered, however,
to accommodate $k < -1$:


(T.3-1) 


Using  this  we  obtain 


CO 

|  UP  e’Zu  (l+u2)“a  du  = 


00 

z 

m=0 


C-l)m  Ca)m(2m)l 
in!  Z2m+1 


(I. 3-2) 


Z 

m»0 


(-l)m*p(q)m  r(P+2m+l) 
m!  Z2m+1’P  " 
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Appendix  J 

Extensions  to  Large  Arrays 
J.l  The  N  Element  Cross  Correlator 

The analysis of sections 4.3 and 4.4 for a two element crosscorrelator
is sufficiently general to cover larger arrays. In this
section we consider the configuration shown in Figure J.1-1.


n,  (t) 


Figure  J.l-1 

We consider the receivers to be uniformly spaced in a linear array
with steering delays that are multiples of a common steering
parameter $\tau$:

$$\tau_i = (i-1)\,\tau, \qquad i = 1, \ldots, N \tag{J.1-1}$$
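The uniform steering rule can be illustrated with a small delay-and-sum sketch. It is an assumption-laden example, not the processor of this appendix: the signals are sampled, the common steering parameter is a whole number of samples, each channel is delayed (not advanced) by its steering delay, and the function names are hypothetical.

```python
def steering_delays(n_elements, tau):
    """Steering delay of element i (1-based): (i - 1) * tau."""
    return [(i - 1) * tau for i in range(1, n_elements + 1)]


def delay_and_sum(channels, tau_samples):
    """Sum equal-length channels after delaying channel i by (i-1)*tau_samples.

    tau_samples is assumed to be a non-negative integer number of samples;
    samples shifted in from before the start of a record are taken as zero.
    """
    delays = steering_delays(len(channels), tau_samples)
    length = len(channels[0])
    out = [0.0] * length
    for x, d in zip(channels, delays):
        for t in range(length):
            src = t - d
            if 0 <= src < length:
                out[t] += x[src]
    return out


if __name__ == "__main__":
    # Three elements; the impulse reaches element 3 first and element 1 last.
    chans = [[0, 0, 0, 0, 1, 0], [0, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0]]
    print(delay_and_sum(chans, 2))
```

When the steering parameter matches the inter-element arrival delay, the channel contributions line up and add coherently at a single output sample.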
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The  filtered  output  of  the  summer  is  thus 


y(t)  *  "  (2?  | (»)♦«.  (u,t-T1)d«(«)]  e'j“Ti 


OJ 


(J.l-2) 


One  can  readily  show  that 


N  N  r 

S{T,t,p)  »  E  I  {X  [  [|  H-(w)  | 2  dl 
i-i  k*l  ™  J  1 


ni\ 


(«) 


k  dZxx(w)"|  e^Tk~Ti^} 

#  f  J 


(J.l-3) 


Furthermore,  the  general  second  order  moment  is  given  by 


A(t,t\T,u)  = 


N  N  N  N 


ls2 


Z  Z  l  Z  {(£) 


i  si  k*l  £«1  m=l 


|  [t+v-u] 

m-t 


m+t 


1 


T-v+y 


{  M.  .(v-t.+t')  M.  (v-t.+t') 
1  ikv  1  k'  inr  l  nr 


*  Mlk  (u-Ti+Tk5  *  Mta(v-Vti)  Kik(v-Ti+Tk) 


*  Mtk(v+Tk-TP  +  Mim(v+T;-Ti5  Ktk^v+,k-TP 


♦  w^i-v  Mik(v+Tk-TP  +  jikL(Tt-Ti-vTk’v+Tk-Ti) 


(2) 

♦  J>.  i  (t#-T.  ,t'-T,'  V+T,'-T..  ,  t(3)  ,  ,  .  \ 

iktm  t  i'  m  k>  k  1)  ♦  CVVV^'^VV  1 


dv 


where  t  and  t*  axe  two  steering  parameters  with 


\  ■  (i-l)t  xj  -  (k-l)T' 


and 


Mik(v)  o  f 


0) 


ej“V  lHfM|2  snAM  ST 


L  (v)  »  (  e^wv  ♦,  .  t  s  _  f  <.  dio 

ik  J  (w,(u,v)  S^Cw) 


Jiklm(v'T>T,:i 


|  J  ej  [WV+W'CV-T+T*)^ 


CJ  (t)' 


(4) 


n  ^  du* 


sxxMSxxcu-)  H  s 


42L(v',-t')  = 


i  (wT+to’T*] 


(I)  u 


,(4) 


ifkf£fmf(a3,-(i)',-(D,w',v,T,v+T,|  S^CuOS^Cw')  ~ 


jikL  ■ 


j  Ju(v+T  * )  +U>*  (V-t)J 


(1)  w' 


t')  SxxM  S^O.')  g 

(4) 

The  fourth  order  moments  4>  '  are  formed  by  anology  with  ( 


(4) 


H.(a),t)  Hk(to',t-y)*  Hl(u»",t-ii»)  HJw*  "  ,t-y •  •)* 


(J.l-5) 

(J.l-6) 

(J.l-V) 

(J. 1-8) 

(J. 1-9) 

.5-4)  by 
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(J.l-10) 
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Signals  reflected  from  the  sea  surface  undergo  distortion  which  limits  their  detectability 
and  usability  for  tracking.  Two  propagation  geometries  are  analysed.  The  first  deals 
with  the  crosscorrelation  of  surface-reflected  and  direct  transmission  paths,  and  the  second 
with the crosscorrelation of surface-scattered signals received at two different locations.
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both  gaussian  random  variables.  Three  models  of  the  scattering  mechanism  are  proposed 
and  two  are  analysed  in  detail.  In  all  cases  the  correlator  output  is  shown  to  exhibit 
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